F.R. Gantmacher 


Theory OF 
MATRICES 


Volume 2 




THE THEORY OF 


MATRICES 


BY 


F. R. GANTMACHER 


VOLUME TWO 


1959 


PREFACE


THE MATRIX CALCULUS is widely applied nowadays in various branches of 
mathematics, mechanics, theoretical physics, theoretical electrical engineer- 
ing, etc. However, neither in the Soviet nor the foreign literature is there a 
book that gives a sufficiently complete account of the problems of matrix 
theory and of its diverse applications. The present book is an attempt to fill 
this gap in the mathematical literature. 

The book is based on lecture courses on the theory of matrices and its 
applications that the author has given several times in the course of the last 
seventeen years at the Universities of Moscow and Tiflis and at the Moscow 
Institute of Physical Technology. 

The book is meant not only for mathematicians (undergraduates and 
research students) but also for specialists in allied fields (physics, engi- 
neering) who are interested in mathematics and its applications. Therefore 
the author has endeavoured to make his account of the material as accessible 
as possible, assuming only that the reader is acquainted with the theory of 
determinants and with the usual course of higher mathematics within the 
programme of higher technical education. Only a few isolated sections in
the last chapters of the book require additional mathematical knowledge on
the part of the reader. Moreover, the author has tried to keep the individual
chapters as far as possible independent of each other. For example,
Chapter V, Functions of Matrices, does not depend on the material contained
in Chapters II and III. At those places of Chapter V where fundamental
concepts introduced in Chapter IV are being used for the first time,
the corresponding references are given. Thus, a reader who is acquainted 
with the rudiments of the theory of matrices can immediately begin with 
reading the chapters that interest him. 

The book consists of two parts, containing fifteen chapters. 

In Chapters I and III, information about matrices and linear operators
is developed ab initio and the connection between operators and matrices 
is introduced. 

Chapter II expounds the theoretical basis of Gauss’s elimination method
and certain associated effective methods of solving a system of n linear
equations, for large n. In this chapter the reader also becomes acquainted
with the technique of operating with matrices that are divided into rectangular
‘blocks.’




In Chapter IV we introduce the extremely important ‘characteristic’ 
and ‘minimal’ polynomials of a square matrix, and the ‘adjoint’ and ‘reduced
adjoint’ matrices. 

In Chapter V, which is devoted to functions of matrices, we give the 
general definition of f(A) as well as concrete methods of computing it,
where f(λ) is a function of a scalar argument λ and A is a square matrix.
The concept of a function of a matrix is used in §§ 5 and 6 of this chapter
for a complete investigation of the solutions of a system of linear differential
equations of the first order with constant coefficients. Both the concept
of a function of a matrix and this latter investigation of differential equations
are based entirely on the concept of the minimal polynomial of a matrix
and (in contrast to the usual exposition) do not use the so-called theory of
elementary divisors, which is treated in Chapters VI and VII.

These five chapters constitute a first course on matrices and their applications.
Very important problems in the theory of matrices arise in connection
with the reduction of matrices to a normal form. This reduction
is carried out on the basis of Weierstrass’ theory of elementary divisors.
In view of the importance of this theory we give two expositions in this
book: an analytic one in Chapter VI and a geometric one in Chapter VII.
We draw the reader’s attention to §§ 7 and 8 of Chapter VI, where we study
effective methods of finding a matrix that transforms a given matrix to
normal form. In § 8 of Chapter VII we investigate in detail the method
of A. N. Krylov for the practical computation of the coefficients of the 
characteristic polynomial. 

In Chapter VIII certain types of matrix equations are solved. We also
consider here the problem of determining all the matrices that are permutable
with a given matrix and we study in detail the many-valued functions of
matrices $\sqrt[m]{A}$ and $\ln A$.

Chapters IX and X deal with the theory of linear operators in a unitary
space and the theory of quadratic and hermitian forms. These chapters do
not depend on Weierstrass’ theory of elementary divisors and use, of the
preceding material, only the basic information on matrices and linear operators
contained in the first three chapters of the book. In § 9 of Chapter X
we apply the theory of forms to the study of the principal oscillations of a 
system with n degrees of freedom. In § 11 of this chapter we give an account 
of Frobenius’ deep results on the theory of Hankel forms. These results are 
used later, in Chapter XV, to study special cases of the Routh-Hurwitz 
problem. 

The last five chapters form the second part of the book [the second
volume, in the present English translation]. In Chapter XI we determine
normal forms for complex symmetric, skew-symmetric, and orthogonal matrices
and establish interesting connections of these matrices with real matrices
of the same classes and with unitary matrices. 

In Chapter XII we expound the general theory of pencils of matrices of 
the form A + λB, where A and B are arbitrary rectangular matrices of the
same dimensions. Just as the study of regular pencils of matrices A + λB
is based on Weierstrass’ theory of elementary divisors, so the study of singular
pencils is built upon Kronecker’s theory of minimal indices, which is, as
it were, a further development of Weierstrass’ theory. By means of Kronecker’s
theory (the author believes that he has succeeded in simplifying the
exposition of this theory) we establish in Chapter XII canonical forms of
the pencil of matrices A + λB in the most general case. The results obtained
there are applied to the study of systems of linear differential equations
with constant coefficients.

In Chapter XIII we explain the remarkable spectral properties of mat- 
rices with non-negative elements and consider two important applications 
of matrices of this class: 1) homogeneous Markov chains in the theory of 
probability and 2) oscillatory properties of elastic vibrations in mechanics. 
The matrix method of studying homogeneous Markov chains was developed 
in the book [46] by V. I. Romanovskii and is based on the fact that the matrix
of transition probabilities in a homogeneous Markov chain with a finite 
number of states is a matrix with non-negative elements of a special type 
(a ‘stochastic’ matrix). 

The oscillatory properties of elastic vibrations are connected with another 
important class of non-negative matrices—the ‘oscillation matrices.’ These 
matrices and their applications were studied by M. G. Krein jointly with 
the author of this book. In Chapter XIII, only certain basic results in this 
domain are presented. The reader can find a detailed account of the whole 
material in the monograph [17].

In Chapter XIV we compile the applications of the theory of matrices 
to systems of differential equations with variable coefficients. The central 
place (§§ 5-9) in this chapter belongs to the theory of the multiplicative 
integral (Produktintegral) and its connection with Volterra’s infinitesimal 
calculus. These problems are almost entirely unknown in Soviet mathematical
literature. In the first sections and in § 11, we study reducible
systems (in the sense of Lyapunov) in connection with the problem of stabil- 
ity of motion; we also give certain results of N. P. Erugin. Sections 9-11 
refer to the analytic theory of systems of differential equations. Here we 
clarify an inaccuracy in Birkhoff’s fundamental theorem, which is usually 
applied to the investigation of the solution of a system of differential equa- 
tions in the neighborhood of a singular point, and we establish a canonical 
form of the solution in the case of a regular singular point. 




In § 12 of Chapter XIV we give a brief survey of some results of the 
fundamental investigations of I. A. Lappo-Danilevskii on analytic functions
of several matrices and their applications to differential systems. 

The last chapter, Chapter XV, deals with the applications of the theory 
of quadratic forms (in particular, of Hankel forms) to the Routh-Hurwitz 
problem of determining the number of roots of a polynomial in the right 
half-plane (Re z > 0). The first sections of the chapter contain the classical
treatment of the problem. In § 5 we give the theorem of A. M. Lyapunov in 
which a stability criterion is set up which is equivalent to the Routh-Hurwitz 
criterion. Together with the stability criterion of Routh-Hurwitz we give, 
in § 11 of this chapter, the comparatively little known criterion of Liénard 
and Chipart in which the number of determinant inequalities is only about 
half of that in the Routh-Hurwitz criterion. 

At the end of Chapter XV we exhibit the close connection between stabil- 
ity problems and two remarkable theorems of A. A. Markov and P. L. 
Chebyshev, which were obtained by these celebrated authors on the basis of the 
expansion of certain continued fractions of special types in series of decreas- 
ing powers of the argument. Here we give a matrix proof of these theorems. 

This, then, is a brief summary of the contents of this book. 


F. R. Gantmacher 


PUBLISHERS’ PREFACE 


THE PUBLISHERS WISH TO thank Professor Gantmacher for his kindness in
communicating to the translator new versions of several paragraphs of the 
original Russian-language book. 

The Publishers also take pleasure in thanking the VEB Deutscher Verlag 
der Wissenschaften, whose many published translations of Russian scientific 
books into the German language include a counterpart of the present work, 
for their kind spirit of cooperation in agreeing to the use of their formulas 
in the preparation of the present work. 

No material changes have been made in the text in translating the present 
work from the Russian except for the replacement of several paragraphs by 
the new versions supplied by Professor Gantmacher. Some changes in the 
references and in the Bibliography have been made for the benefit of the 
English-language reader. 


CONTENTS


PREFACE
PUBLISHERS’ PREFACE

XI. COMPLEX SYMMETRIC, SKEW-SYMMETRIC, AND ORTHOGONAL MATRICES
§ 1. Some formulas for complex orthogonal and unitary matrices
§ 2. Polar decomposition of a complex matrix
§ 3. The normal form of a complex symmetric matrix
§ 4. The normal form of a complex skew-symmetric matrix
§ 5. The normal form of a complex orthogonal matrix

XII. SINGULAR PENCILS OF MATRICES
§ 1. Introduction
§ 2. Regular pencils of matrices
§ 3. Singular pencils. The reduction theorem
§ 4. The canonical form of a singular pencil of matrices
§ 5. The minimal indices of a pencil. Criterion for strong equivalence of pencils
§ 6. Singular pencils of quadratic forms
§ 7. Application to differential equations

XIII. MATRICES WITH NON-NEGATIVE ELEMENTS
§ 1. General properties
§ 2. Spectral properties of irreducible non-negative matrices
§ 3. Reducible matrices
§ 4. The normal form of a reducible matrix
§ 5. Primitive and imprimitive matrices
§ 6. Stochastic matrices
§ 7. Limiting probabilities for a homogeneous Markov chain with a finite number of states
§ 8. Totally non-negative matrices
§ 9. Oscillatory matrices

XIV. APPLICATIONS OF THE THEORY OF MATRICES TO THE INVESTIGATION OF SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS
§ 1. Systems of linear differential equations with variable coefficients. General concepts
§ 2. Lyapunov transformations
§ 3. Reducible systems
§ 4. The canonical form of a reducible system. Erugin’s theorem
§ 5. The matricant
§ 6. The multiplicative integral. The infinitesimal calculus of Volterra
§ 7. Differential systems in a complex domain. General properties
§ 8. The multiplicative integral in a complex domain
§ 9. Isolated singular points
§ 10. Regular singularities
§ 11. Reducible analytic systems
§ 12. Analytic functions of several matrices and their application to the investigation of differential systems. The papers of Lappo-Danilevskii

XV. THE PROBLEM OF ROUTH-HURWITZ AND RELATED QUESTIONS
§ 1. Introduction
§ 2. Cauchy indices
§ 3. Routh’s algorithm
§ 4. The singular case. Examples
§ 5. Lyapunov’s theorem
§ 6. The theorem of Routh-Hurwitz
§ 7. Orlando’s formula
§ 8. Singular cases in the Routh-Hurwitz theorem
§ 9. The method of quadratic forms. Determination of the number of distinct real roots of a polynomial
§ 10. Infinite Hankel matrices of finite rank
§ 11. Determination of the index of an arbitrary rational fraction by the coefficients of numerator and denominator
§ 12. Another proof of the Routh-Hurwitz theorem
§ 13. Some supplements to the Routh-Hurwitz theorem. Stability criterion of Liénard and Chipart
§ 14. Some properties of Hurwitz polynomials. Stieltjes’ theorem. Representation of Hurwitz polynomials by continued fractions
§ 15. Domain of stability. Markov parameters
§ 16. Connection with the problem of moments
§ 17. Theorems of Markov and Chebyshev
§ 18. The generalized Routh-Hurwitz problem

BIBLIOGRAPHY


CHAPTER XI 


COMPLEX SYMMETRIC, SKEW-SYMMETRIC, AND 
ORTHOGONAL MATRICES 


In Volume I, Chapter IX, in connection with the study of linear operators
in a euclidean space, we investigated real symmetric, skew-symmetric, and
orthogonal matrices, i.e., real square matrices characterized by the relations†


$S^T = S$, $K^T = -K$, and $Q^T = Q^{-1}$,


respectively (here $Q^T$ denotes the transpose of the matrix Q). We have
shown that in the field of complex numbers all these matrices have linear 
elementary divisors and we have set up normal forms for them, i.e., ‘simplest’
real symmetric, skew-symmetric, and orthogonal matrices to which arbitrary 
matrices of the types under consideration are real-similar and orthogonally 
similar. 

The present chapter deals with the investigation of complex symmetric, 
skew-symmetric, and orthogonal matrices. We shall clarify the question
of what elementary divisors these matrices can have and shall set up normal
forms for them. These forms have a considerably more complicated structure
than the corresponding normal forms in the real case. As a preliminary,
we shall establish in the first section interesting connections between complex
orthogonal and unitary matrices on the one hand, and real symmetric,
skew-symmetric, and orthogonal matrices on the other hand. 


§ 1. Some Formulas for Complex Orthogonal and Unitary Matrices 


1. We begin with a lemma:

LEMMA 1:¹ 1. If a matrix G is both hermitian and orthogonal ($G^T = \bar{G} = G^{-1}$),
then it can be represented in the form

$$G = I e^{iK}, \tag{1}$$

where I is a real symmetric involutory matrix and K a real skew-symmetric
matrix permutable with it:


1 See [169], pp. 223-225.
† In this and in the following chapters, a matrix denoted by the letter Q is not
necessarily orthogonal.


$$I = \bar{I} = I^T, \quad I^2 = E, \quad K = \bar{K} = -K^T. \tag{2}$$


2. If, in addition, G is a positive-definite hermitian matrix,² then in
(1) I = E and

$$G = e^{iK}. \tag{3}$$
Proof. Let

$$G = S + iT, \tag{4}$$

where S and T are real matrices. Then

$$\bar{G} = S - iT \quad \text{and} \quad G^T = S^T + iT^T. \tag{5}$$

Therefore the equation $\bar{G} = G^T$ implies that $S = S^T$ and $T = -T^T$, i.e., S is
symmetric and T skew-symmetric.

Moreover, when the expressions for G and $\bar{G}$ from (4) and (5) are substituted
in the complex equation $G\bar{G} = E$, it breaks up into two real equations:

$$S^2 + T^2 = E \quad \text{and} \quad ST = TS. \tag{6}$$

The second of these equations shows that S and T commute.

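By Theorem 12’ of Chapter IX (Vol. I, p. 292), the commuting normal
matrices S and T can be carried simultaneously into quasi-diagonal form by
a real orthogonal transformation. Therefore³

$$
\begin{aligned}
S &= Q\,\{s_1, s_1, s_2, s_2, \ldots, s_q, s_q, s_{2q+1}, \ldots, s_n\}\,Q^{-1},\\
T &= Q\,\left\{ \begin{Vmatrix} 0 & t_1 \\ -t_1 & 0 \end{Vmatrix}, \begin{Vmatrix} 0 & t_2 \\ -t_2 & 0 \end{Vmatrix}, \ldots, \begin{Vmatrix} 0 & t_q \\ -t_q & 0 \end{Vmatrix}, 0, \ldots, 0 \right\} Q^{-1}
\end{aligned}
\qquad (Q = \bar{Q} = (Q^T)^{-1}) \tag{7}
$$

(the numbers $s_i$ and $t_i$ are real). Hence

$$
G = S + iT = Q\,\left\{ \begin{Vmatrix} s_1 & it_1 \\ -it_1 & s_1 \end{Vmatrix}, \begin{Vmatrix} s_2 & it_2 \\ -it_2 & s_2 \end{Vmatrix}, \ldots, \begin{Vmatrix} s_q & it_q \\ -it_q & s_q \end{Vmatrix}, s_{2q+1}, \ldots, s_n \right\} Q^{-1}. \tag{8}
$$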


On the other hand, when we compare the expressions (7) for S and T
with the first of the equations (6), we find:

$$s_1^2 - t_1^2 = 1,\ \ s_2^2 - t_2^2 = 1,\ \ldots,\ s_q^2 - t_q^2 = 1,\quad s_{2q+1} = \pm 1,\ \ldots,\ s_n = \pm 1. \tag{9}$$


2 I.e., G is the coefficient matrix of a positive-definite hermitian form (see Vol. I,
Chapter X, § 9).
3 See also the Note following Theorem 12’ of Vol. I, Chapter IX (p. 293).




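Now it is easy to verify that a matrix of the type $\begin{Vmatrix} s & it \\ -it & s \end{Vmatrix}$ with $s^2 - t^2 = 1$
can always be represented in the form

$$\begin{Vmatrix} s & it \\ -it & s \end{Vmatrix} = \varepsilon\, e^{\, i \begin{Vmatrix} 0 & \varphi \\ -\varphi & 0 \end{Vmatrix}},$$

where

$$|s| = \cosh\varphi, \quad \varepsilon t = \sinh\varphi, \quad \varepsilon = \operatorname{sign} s.$$

Therefore we have from (8) and (9):

$$
G = Q\,\left\{ \pm e^{\, i \begin{Vmatrix} 0 & \varphi_1 \\ -\varphi_1 & 0 \end{Vmatrix}}, \pm e^{\, i \begin{Vmatrix} 0 & \varphi_2 \\ -\varphi_2 & 0 \end{Vmatrix}}, \ldots, \pm e^{\, i \begin{Vmatrix} 0 & \varphi_q \\ -\varphi_q & 0 \end{Vmatrix}}, \pm 1, \ldots, \pm 1 \right\} Q^{-1}, \tag{10}
$$

i.e.,

$$G = I e^{iK},$$

where

$$
\begin{aligned}
I &= Q\,\{\pm 1, \pm 1, \ldots, \pm 1\}\,Q^{-1},\\
K &= Q\,\left\{ \begin{Vmatrix} 0 & \varphi_1 \\ -\varphi_1 & 0 \end{Vmatrix}, \ldots, \begin{Vmatrix} 0 & \varphi_q \\ -\varphi_q & 0 \end{Vmatrix}, 0, \ldots, 0 \right\} Q^{-1}
\end{aligned}
\tag{11}
$$

and

$$IK = KI.$$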


From (11) there follows the equation (2). 
2. If, in addition, it is known that G is a positive-definite hermitian
matrix, then we can state that all the characteristic values of G are positive
(see Volume I, Chapter IX, p. 270). But by (10) these characteristic values
are

$$\pm e^{\varphi_1},\ \pm e^{-\varphi_1},\ \pm e^{\varphi_2},\ \pm e^{-\varphi_2},\ \ldots,\ \pm e^{\varphi_q},\ \pm e^{-\varphi_q},\ \pm 1,\ \ldots,\ \pm 1$$

(here the signs correspond to the signs in (10)).

Therefore in the formula (10) and the first formula of (11), wherever
the sign ± occurs, the + sign must hold. Hence

$$I = Q\,\{1, 1, \ldots, 1\}\,Q^{-1} = E,$$


and this is what we had to prove. 
This completes the proof of the lemma. 
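
The key step of the proof, the representation of a 2×2 block with entries s, it, −it, s (where s² − t² = 1) as ε times an exponential, can be checked numerically. The following is only a sketch under stated assumptions: the helper names (`matmul`, `expm`, `block`, `block_from_exponential`) are illustrative and not from the text, and the exponential is approximated by a truncated power series, which is adequate only for small matrices with moderate entries.

```python
import math


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(m, terms=40):
    """Matrix exponential by a truncated power series (illustrative only)."""
    n = len(m)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes m^k / k!
        term = [[x / k for x in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def block(s, t):
    """The 2x2 block with rows (s, it) and (-it, s)."""
    return [[complex(s), 1j * t], [-1j * t, complex(s)]]


def block_from_exponential(s, t):
    """Rebuild the block as eps * exp(i * [[0, phi], [-phi, 0]])."""
    eps = 1.0 if s > 0 else -1.0          # eps = sign s
    phi = math.asinh(eps * t)             # eps * t = sinh(phi)
    generator = [[0.0, 1j * phi], [-1j * phi, 0.0]]
    return [[eps * x for x in row] for row in expm(generator)]
```

For s = −cosh 0.7 and t = sinh 0.7, for instance, `block(s, t)` and `block_from_exponential(s, t)` should agree up to rounding error, with ε = −1 and φ = −0.7.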




By means of the lemma we shall now prove the following theorem:

THEOREM 1: Every complex orthogonal matrix Q can be represented in
the form

$$Q = R e^{iK}, \tag{12}$$

where R is a real orthogonal matrix and K a real skew-symmetric matrix:

$$R = \bar{R} = (R^T)^{-1}, \quad K = \bar{K} = -K^T. \tag{13}$$


Proof. Suppose that (12) holds. Then

$$Q^* = \bar{Q}^T = e^{iK} R^T$$

and

$$Q^*Q = e^{iK} R^T R\, e^{iK} = e^{2iK}.$$

By the preceding lemma the required real skew-symmetric matrix K can
be determined from the equation

$$Q^*Q = e^{2iK} \tag{14}$$

because the matrix $Q^*Q$ is positive-definite hermitian and orthogonal. After
K has been determined from (14) we can find R from (12):

$$R = Q e^{-iK}. \tag{15}$$

Then

$$R^*R = e^{-iK} Q^*Q\, e^{-iK} = E;$$

i.e., R is unitary. On the other hand, it follows from (15) that R, as the
product of two orthogonal matrices, is itself orthogonal: $R^TR = E$. Thus
R is at the same time unitary and orthogonal, and hence real. The formula
(15) can be written in the form (12).

This proves the theorem.⁴
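
The recipe of the proof (determine K from Q*Q = e^{2iK}, then set R = Qe^{−iK}) can be traced in the 2×2 case, where every real skew-symmetric K has the form φJ with J the matrix with rows (0, 1), (−1, 0), and e^{iφJ} = cosh φ · E + i sinh φ · J. The sketch below assumes this closed form; the function names are illustrative and not from the text.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def conj_transpose(a):
    """Hermitian conjugate A* of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def exp_i_phi_j(phi):
    """Closed form of exp(i*phi*J) for J = [[0, 1], [-1, 0]]."""
    c, s = math.cosh(phi), math.sinh(phi)
    return [[c, 1j * s], [-1j * s, c]]


def rotation(theta):
    """A real orthogonal 2x2 matrix."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]


def factor_complex_orthogonal(q):
    """Return (R, phi) with Q = R * exp(i*phi*J) for a 2x2 complex orthogonal Q."""
    g = matmul(conj_transpose(q), q)        # Q*Q = exp(2i*phi*J), as in (14)
    phi = 0.5 * math.asinh(g[0][1].imag)    # entry (1,2) of Q*Q is i*sinh(2*phi)
    r = matmul(q, exp_i_phi_j(-phi))        # R = Q * exp(-iK), as in (15)
    return r, phi
```

Starting from `q = matmul(rotation(0.4), exp_i_phi_j(-1.1))`, the call `factor_complex_orthogonal(q)` should return φ ≈ −1.1 and a matrix R whose imaginary parts vanish up to rounding and whose real part is the rotation through 0.4.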

Now we establish the following lemma: 


LEMMA 2: If a matrix D is both symmetric and unitary ($D = D^T = \bar{D}^{-1}$),
then it can be represented in the form

$$D = e^{iS}, \tag{16}$$

where S is a real symmetric matrix ($S = \bar{S} = S^T$).


4 The formula (12), like the polar decomposition of a complex matrix (in connection
with the formulas (87), (88) on p. 278 of Vol. I) has a close connection with the important 
Theorem of Cartan which establishes a certain representation for the automorphisms of 
the complex Lie groups; see [169], pp. 232-233. 




Proof. We set

$$D = U + iV \quad (U = \bar{U},\ V = \bar{V}). \tag{17}$$

Then

$$\bar{D} = U - iV, \quad D^T = U^T + iV^T.$$

The complex equation $D = D^T$ splits into the two real equations

$$U = U^T, \quad V = V^T.$$

Thus, U and V are real symmetric matrices.

The equation $D\bar{D} = E$ implies:

$$U^2 + V^2 = E, \quad UV = VU. \tag{18}$$

By the second of these equations, U and V commute. When we apply
Theorem 12’ (together with the Note) of Chapter IX (Vol. I, pp. 292-3)
to them, we obtain:

$$U = Q\,\{s_1, s_2, \ldots, s_n\}\,Q^{-1}, \quad V = Q\,\{t_1, t_2, \ldots, t_n\}\,Q^{-1} \quad (Q = \bar{Q} = (Q^T)^{-1}). \tag{19}$$

Here $s_k$ and $t_k$ (k = 1, 2, ..., n) are real numbers. Now the first of the
equations (18) yields:

$$s_k^2 + t_k^2 = 1 \quad (k = 1, 2, \ldots, n).$$

Therefore there exist real numbers $\varphi_k$ (k = 1, 2, ..., n) such that

$$s_k = \cos\varphi_k, \quad t_k = \sin\varphi_k \quad (k = 1, 2, \ldots, n).$$

Substituting these expressions for $s_k$ and $t_k$ in (19) and using (17), we find:

$$D = Q\,\{e^{i\varphi_1}, e^{i\varphi_2}, \ldots, e^{i\varphi_n}\}\,Q^{-1} = e^{iS},$$

where

$$S = Q\,\{\varphi_1, \varphi_2, \ldots, \varphi_n\}\,Q^{-1}. \tag{20}$$

From (20) it follows that $S = \bar{S} = S^T$.

This proves the lemma.
Using the lemma we shall now prove the following theorem: 


THEOREM 2: Every unitary matrix U can be represented in the form

$$U = R e^{iS}, \tag{21}$$

where R is a real orthogonal matrix and S a real symmetric matrix:

$$R = \bar{R} = (R^T)^{-1}, \quad S = \bar{S} = S^T. \tag{22}$$




Proof. From (21) it follows that

$$U^T = e^{iS} R^T. \tag{23}$$

Multiplying (21) and (23), we obtain from (22):

$$U^TU = e^{iS} R^T R\, e^{iS} = e^{2iS}.$$

By Lemma 2, the real symmetric matrix S can be determined from the
equation

$$U^TU = e^{2iS} \tag{24}$$

because $U^TU$ is symmetric and unitary. After S has been determined, we
determine R by the equation

$$R = U e^{-iS}. \tag{25}$$

Then

$$R^T = e^{-iS} U^T \tag{26}$$

and so from (24), (25), and (26) it follows that

$$R^TR = e^{-iS} U^TU\, e^{-iS} = E,$$

i.e., R is orthogonal.

On the other hand, by (25) R is the product of two unitary matrices 
and is therefore itself unitary. Since R is both orthogonal and unitary, 
it is real. Formula (25) can be written in the form (21). 

This proves the theorem. 


§ 2. Polar Decomposition of a Complex Matrix 


We shall prove the following theorem: 
THEOREM 3: If $A = \| a_{ik} \|_1^n$ is a non-singular matrix with complex
elements, then

$$A = SQ \tag{27}$$

and

$$A = Q_1 S_1, \tag{28}$$

where S and $S_1$ are complex symmetric matrices, Q and $Q_1$ complex orthogonal
matrices. Moreover,

$$S = \sqrt{AA^T} = f(AA^T), \quad S_1 = \sqrt{A^TA} = f_1(A^TA),$$

where f(λ), $f_1(\lambda)$ are polynomials in λ.

The factors S and Q in (27) ($Q_1$ and $S_1$ in (28)) are permutable if and
only if A and $A^T$ are permutable.




Proof. It is sufficient to establish (27), for when we apply this decomposition
to the matrix $A^T$ and determine A from the formula thus obtained,
we arrive at (28).

If (27) holds, then

$$A = SQ, \quad A^T = Q^{-1}S$$

and therefore

$$AA^T = S^2. \tag{29}$$

Conversely, since $AA^T$ is non-singular ($|AA^T| = |A|^2 \neq 0$), the function
$\sqrt{\lambda}$ is defined on the spectrum of this matrix⁵ and therefore an interpolation
polynomial f(λ) exists such that

$$\sqrt{AA^T} = f(AA^T). \tag{30}$$

We denote the symmetric matrix (30) by

$$S = \sqrt{AA^T}.$$

Then (29) holds, and so $|S| \neq 0$. Determining Q from (27)

$$Q = S^{-1}A,$$

we verify easily that it is an orthogonal matrix. Thus (27) is established.

If the factors S and Q in (27) are permutable, then the matrices

$$A = SQ \quad \text{and} \quad A^T = Q^{-1}S$$

are permutable, since

$$AA^T = S^2, \quad A^TA = Q^{-1}S^2Q.$$

Conversely, if $AA^T = A^TA$, then

$$S^2 = Q^{-1}S^2Q,$$

i.e., Q is permutable with $S^2 = AA^T$. But then Q is also permutable with
the matrix $S = f(AA^T)$.
Thus the theorem is proved completely. 
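
For 2×2 matrices the construction in the proof can be carried out explicitly. The sketch below relies on an assumption that is not in the text: for a 2×2 matrix M with determinant Δ and trace τ, one square root is (M + δE)/σ with δ = √Δ and σ = √(τ + 2δ), provided σ ≠ 0 (this follows from the Cayley-Hamilton theorem). It is a first-degree polynomial in M, in line with S = f(AAᵀ). The function names are illustrative.

```python
import cmath


def matmul2(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose2(a):
    """Transpose (without conjugation) of a 2x2 matrix."""
    return [[a[j][i] for j in range(2)] for i in range(2)]


def polar_2x2(a):
    """Return (S, Q) with A = S Q, S complex symmetric, Q complex orthogonal."""
    m = matmul2(a, transpose2(a))                       # M = A A^T, symmetric
    delta = cmath.sqrt(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    sigma = cmath.sqrt(m[0][0] + m[1][1] + 2 * delta)
    if abs(sigma) < 1e-12:
        raise ValueError("chosen branch of the square root is degenerate")
    # S = (M + delta*E) / sigma, a polynomial of degree one in M
    s = [[(m[0][0] + delta) / sigma, m[0][1] / sigma],
         [m[1][0] / sigma, (m[1][1] + delta) / sigma]]
    det_s = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    s_inv = [[s[1][1] / det_s, -s[0][1] / det_s],
             [-s[1][0] / det_s, s[0][0] / det_s]]
    q = matmul2(s_inv, a)                               # Q = S^{-1} A
    return s, q
```

For a non-singular input such as `[[1 + 2j, 0.5], [-1j, 2 - 1j]]`, the returned S should be symmetric, QᵀQ should equal E up to rounding, and the product SQ should reproduce A.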


2. Using the polar decomposition we shall now prove the following theorem:

5 See Vol. I, Chapter V, § 1. We choose a single-valued branch of the function $\sqrt{\lambda}$
in a simply connected domain containing all the characteristic values of $AA^T$, but not the
number 0.




THEOREM 4: If two complex symmetric or skew-symmetric or orthogonal
matrices are similar:

$$B = T^{-1}AT, \tag{31}$$

then they are orthogonally similar; i.e., there exists an orthogonal matrix Q
such that

$$B = Q^{-1}AQ. \tag{32}$$


Proof. From the conditions of the theorem there follows the existence
of a polynomial q(λ) such that

$$A^T = q(A), \quad B^T = q(B). \tag{33}$$

In the case of symmetric matrices this polynomial q(λ) is identically equal
to λ and, in the case of skew-symmetric matrices, to −λ. If A and B are
orthogonal matrices, then q(λ) is the interpolation polynomial for 1/λ on
the common spectrum of A and B.

Using (33), we conduct the proof of our theorem exactly as we did the
proof of the corresponding Theorem 10 of Chapter IX in the real case
(Vol. I, p. 289). From (31) we deduce

$$q(B) = T^{-1} q(A)\, T$$

or by (33)

$$B^T = T^{-1}A^T T.$$

Hence

$$B = T^T A\, (T^T)^{-1}.$$

Comparing this equation with (31), we easily find:

$$TT^T A = A\, TT^T. \tag{34}$$

Let us apply the polar decomposition to the non-singular matrix T:

$$T = SQ \quad (S = S^T = f(TT^T),\ Q^T = Q^{-1}).$$

Since by (34) the matrix $TT^T$ is permutable with A, the matrix
$S = f(TT^T)$ is also permutable with A. Therefore, when we substitute the
product SQ for T in (31), we have

$$B = Q^{-1}S^{-1}ASQ = Q^{-1}AQ.$$


This completes the proof of the theorem. 




§ 3. The Normal Form of a Complex Symmetric Matrix 


1. We shall prove the following theorem:


THEOREM 5: There exists a complex symmetric matrix with arbitrary
preassigned elementary divisors.⁶


Proof. We consider the matrix H of order n in which the elements of
the first superdiagonal are 1 and all the remaining elements are zero. We
shall show that there exists a symmetric matrix S similar to H:

$$S = THT^{-1}. \tag{35}$$

We shall look for the transforming matrix T starting from the conditions:

$$S = THT^{-1} = S^T = (T^T)^{-1} H^T T^T.$$

This equation can be rewritten as

$$VH = H^T V, \tag{36}$$

where V is the symmetric matrix connected with T by the equation⁷

$$T^T T = -2iV. \tag{37}$$

Recalling properties of the matrices H and $F = H^T$ (Vol. I, pp. 13-14)
we find that every solution V of the matrix equation (36) has the following
form:

$$
V = \begin{Vmatrix}
0 & \cdots & 0 & a_0 \\
\vdots & & a_0 & a_1 \\
0 & a_0 & & \vdots \\
a_0 & a_1 & \cdots & a_{n-1}
\end{Vmatrix}, \tag{38}
$$

where $a_0, a_1, \ldots, a_{n-1}$ are arbitrary complex numbers.

Since it is sufficient for us to find a single transforming matrix T, we
set $a_0 = 1$, $a_1 = \ldots = a_{n-1} = 0$ in this formula and define V by the equation⁸

$$
V = \begin{Vmatrix}
0 & \cdots & 0 & 1 \\
0 & \cdots & 1 & 0 \\
\vdots & & & \vdots \\
1 & \cdots & 0 & 0
\end{Vmatrix}. \tag{39}
$$
1...0 0 


6 In connection with the contents of the present section as well as the two sections
that follow, §§ 4 and 5, see [378].

7 To simplify the following formulas it is convenient to introduce the factor −2i.

8 The matrix V is both symmetric and orthogonal.


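Furthermore, we shall require the transforming matrix T to be symmetric:

$$T = T^T. \tag{40}$$

Then the equation (37) for T can be written as:

$$T^2 = -2iV. \tag{41}$$

We shall now look for the required matrix T in the form of a polynomial
in V. Since $V^2 = E$, this can be taken as a polynomial of the first degree:

$$T = \alpha E + \beta V.$$

From (41), taking into account that $V^2 = E$, we find:

$$\alpha^2 + \beta^2 = 0, \quad 2\alpha\beta = -2i.$$

We can satisfy these relations by setting α = 1, β = −i. Then

$$T = E - iV. \tag{42}$$

T is a non-singular symmetric matrix.⁹ At the same time, from (41):

$$T^{-1} = \tfrac{1}{2} i V^{-1} T = \tfrac{1}{2} i V T,$$

i.e.,

$$T^{-1} = \tfrac{1}{2}(E + iV). \tag{43}$$

Thus, a symmetric form S of H is determined by

$$
S = THT^{-1} = \tfrac{1}{2}(E - iV)\,H\,(E + iV), \qquad
V = \begin{Vmatrix}
0 & \cdots & 0 & 1 \\
0 & \cdots & 1 & 0 \\
\vdots & & & \vdots \\
1 & \cdots & 0 & 0
\end{Vmatrix}. \tag{44}
$$

Since V satisfies the equation (36) and $V^2 = E$, we have $VHV = H^T$, and the
equation (44) can be rewritten as follows:

$$
2S = (H + H^T) + i(HV - VH)
= \begin{Vmatrix}
0 & 1 & \cdots & 0 \\
1 & 0 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 1 \\
0 & \cdots & 1 & 0
\end{Vmatrix}
+ i \begin{Vmatrix}
0 & \cdots & 1 & 0 \\
\vdots & & 0 & -1 \\
1 & 0 & & \vdots \\
0 & -1 & \cdots & 0
\end{Vmatrix}. \tag{45}
$$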

9 The fact that T is non-singular follows, in particular, from (41), because V is
non-singular.




The formula (45) determines a symmetric form S of the matrix H. 

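In what follows, if n is the order of H, $H = H^{(n)}$, then we shall denote the
corresponding matrices T, V, and S by $T^{(n)}$, $V^{(n)}$, and $S^{(n)}$.

Suppose that arbitrary elementary divisors are given:

$$(\lambda - \lambda_1)^{p_1},\ (\lambda - \lambda_2)^{p_2},\ \ldots,\ (\lambda - \lambda_u)^{p_u}. \tag{46}$$

We form the corresponding Jordan matrix

$$J = \{\lambda_1 E^{(p_1)} + H^{(p_1)},\ \lambda_2 E^{(p_2)} + H^{(p_2)},\ \ldots,\ \lambda_u E^{(p_u)} + H^{(p_u)}\}.$$

For every matrix $H^{(p_j)}$ we introduce the corresponding symmetric form
$S^{(p_j)}$. From

$$S^{(p_j)} = T^{(p_j)} H^{(p_j)} [T^{(p_j)}]^{-1} \quad (j = 1, 2, \ldots, u)$$

it follows that

$$\lambda_j E^{(p_j)} + S^{(p_j)} = T^{(p_j)} [\lambda_j E^{(p_j)} + H^{(p_j)}] [T^{(p_j)}]^{-1}.$$

Therefore setting

$$\tilde{S} = \{\lambda_1 E^{(p_1)} + S^{(p_1)},\ \lambda_2 E^{(p_2)} + S^{(p_2)},\ \ldots,\ \lambda_u E^{(p_u)} + S^{(p_u)}\}, \tag{47}$$

$$T = \{T^{(p_1)}, T^{(p_2)}, \ldots, T^{(p_u)}\}, \tag{48}$$

we have:

$$\tilde{S} = TJT^{-1}.$$

$\tilde{S}$ is a symmetric form of J. $\tilde{S}$ is similar to J and has the same elementary
divisors (46) as J. This proves the theorem.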
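
The construction of the symmetric form of a single nilpotent Jordan block can be reproduced directly. The sketch below builds S = ½(E − iV)H(E + iV) for a given order n, where V is the matrix with ones on the secondary diagonal; the function names are illustrative and not from the text. The result should be symmetric and, being similar to H, nilpotent of index exactly n.

```python
def identity(n):
    """Unit matrix E of order n."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def jordan_nilpotent(n):
    """H: ones on the first superdiagonal, zeros elsewhere."""
    return [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n)]


def exchange(n):
    """V: ones on the secondary diagonal, zeros elsewhere."""
    return [[1 if i + j == n - 1 else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def symmetric_form(n):
    """S = (1/2)(E - iV) H (E + iV), a symmetric matrix similar to H."""
    e, h, v = identity(n), jordan_nilpotent(n), exchange(n)
    left = [[e[i][j] - 1j * v[i][j] for j in range(n)] for i in range(n)]
    right = [[e[i][j] + 1j * v[i][j] for j in range(n)] for i in range(n)]
    product = matmul(matmul(left, h), right)
    return [[x / 2 for x in row] for row in product]
```

For n = 2 this gives the matrix with rows (i/2, 1/2) and (1/2, −i/2), whose square is zero. Adding λE to `symmetric_form(p)` yields a symmetric matrix with the single elementary divisor (λ − λ₀)ᵖ for the chosen value λ₀.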


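COROLLARY 1. Every square complex matrix $A = \| a_{ik} \|_1^n$ is similar to a
symmetric matrix.

Applying Theorem 4, we obtain:

COROLLARY 2. Every complex symmetric matrix $S = \| a_{ik} \|_1^n$ is orthogonally
similar to a symmetric matrix with the normal form $\tilde{S}$, i.e., there exists
an orthogonal matrix Q such that

$$S = Q \tilde{S} Q^{-1}. \tag{49}$$

The normal form of a complex symmetric matrix has the quasi-diagonal
form

$$\tilde{S} = \{\lambda_1 E^{(p_1)} + S^{(p_1)},\ \lambda_2 E^{(p_2)} + S^{(p_2)},\ \ldots,\ \lambda_u E^{(p_u)} + S^{(p_u)}\}, \tag{50}$$

where the blocks $S^{(p)}$ are defined as follows (see (44), (45)):

$$
\begin{aligned}
S^{(p)} &= \tfrac{1}{2}\,[E^{(p)} - iV^{(p)}]\, H^{(p)}\, [E^{(p)} + iV^{(p)}] \\
&= \tfrac{1}{2}\,[H^{(p)} + H^{(p)T} + i\,(H^{(p)}V^{(p)} - V^{(p)}H^{(p)})] \\
&= \frac{1}{2} \left\{
\begin{Vmatrix}
0 & 1 & \cdots & 0 \\
1 & 0 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 1 \\
0 & \cdots & 1 & 0
\end{Vmatrix}
+ i \begin{Vmatrix}
0 & \cdots & 1 & 0 \\
\vdots & & 0 & -1 \\
1 & 0 & & \vdots \\
0 & -1 & \cdots & 0
\end{Vmatrix}
\right\}.
\end{aligned}
\tag{51}
$$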


§ 4. The Normal Form of a Complex Skew-symmetric Matrix 


1. We shall examine what restrictions the skew symmetry of a matrix
imposes on its elementary divisors. In this task we shall make use of the 
following theorem: 


THEOREM 6: A skew-symmetric matrix always has even rank.

Proof. Let r be the rank of the skew-symmetric matrix K. Then K has
r linearly independent rows, say those numbered $i_1, i_2, \ldots, i_r$; all the remaining
rows are linear combinations of these r rows. Since the columns of K
are obtained from the corresponding rows by multiplying the elements by
−1, every column of K is a linear combination of the columns numbered
$i_1, i_2, \ldots, i_r$. Therefore every minor of order r of K can be represented in
the form

$$\alpha\, K \begin{pmatrix} i_1 & i_2 & \cdots & i_r \\ i_1 & i_2 & \cdots & i_r \end{pmatrix},$$

where α is a constant.

Hence it follows that

$$K \begin{pmatrix} i_1 & i_2 & \cdots & i_r \\ i_1 & i_2 & \cdots & i_r \end{pmatrix} \neq 0.$$

But a skew-symmetric determinant of odd order is always zero. There- 
fore r is even, and the theorem is proved. 
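Theorem 6 can be spot-checked with exact arithmetic. The sketch below is only an illustration, not part of the proof: it represents complex numbers with rational real and imaginary parts as pairs of `Fraction` values and computes ranks by Gaussian elimination. All names (`g`, `cmul`, `cdiv`, `rank`, `random_skew`, `sample_ranks`) are hypothetical helpers.

```python
from fractions import Fraction
import random

def g(re, im=0):
    """A complex number with rational parts, stored as a pair of Fractions."""
    return (Fraction(re), Fraction(im))

ZERO = g(0)

def cmul(a, b):
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])

def cdiv(a, b):
    d = b[0] * b[0] + b[1] * b[1]
    return ((a[0] * b[0] + a[1] * b[1]) / d, (a[1] * b[0] - a[0] * b[1]) / d)

def csub(a, b):
    return (a[0] - b[0], a[1] - b[1])

def rank(M):
    """Rank by exact row reduction over the Gaussian rationals."""
    M = [row[:] for row in M]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i][c] != ZERO), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, rows):
            f = cdiv(M[i][c], M[r][c])
            M[i] = [csub(x, cmul(f, y)) for x, y in zip(M[i], M[r])]
        r += 1
    return r

def random_skew(n, rng):
    """Random complex skew-symmetric matrix of order n with small integer parts."""
    K = [[ZERO] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            z = g(rng.randint(-3, 3), rng.randint(-3, 3))
            K[i][j] = z
            K[j][i] = (-z[0], -z[1])
    return K

def sample_ranks(seed=0):
    rng = random.Random(seed)
    return [rank(random_skew(n, rng)) for n in range(1, 7) for _ in range(5)]
```

Every rank returned by `sample_ranks()` should be even, including the odd orders, where the determinant itself vanishes.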


THEOREM 7: 1. If $\lambda_0$ is a characteristic value of the skew-symmetric matrix K with the corresponding elementary divisors

\[ (\lambda-\lambda_0)^{f_1},\ (\lambda-\lambda_0)^{f_2},\ \ldots,\ (\lambda-\lambda_0)^{f_t}, \]

then $-\lambda_0$ is also a characteristic value of K with the same number and the same powers of the corresponding elementary divisors of K

\[ (\lambda+\lambda_0)^{f_1},\ (\lambda+\lambda_0)^{f_2},\ \ldots,\ (\lambda+\lambda_0)^{f_t}. \]

2. If zero is a characteristic value of the skew-symmetric matrix K,¹⁰ then in the system of elementary divisors of K all those of even degree corresponding to the characteristic value zero are repeated an even number of times.

Proof. 1. The transposed matrix $K^T$ has the same elementary divisors as K. But $K^T = -K$, and the elementary divisors of $-K$ are obtained from those of K by replacing the characteristic values $\lambda_1, \lambda_2, \ldots$ by $-\lambda_1, -\lambda_2, \ldots$. Hence the first part of our theorem follows.

2. Suppose that to the characteristic value zero of K there correspond $\delta_1$ elementary divisors of the form $\lambda$, $\delta_2$ of the form $\lambda^2$, etc. In general, we denote by $\delta_p$ the number of elementary divisors of the form $\lambda^p$ ($p = 1, 2, \ldots$). We shall show that $\delta_2, \delta_4, \ldots$ are even numbers.

The defect d of K is equal to the number of linearly independent characteristic vectors corresponding to the characteristic value zero or, what is the same, to the number of elementary divisors of the form $\lambda, \lambda^2, \lambda^3, \ldots$. Therefore

\[ d = \delta_1 + \delta_2 + \delta_3 + \cdots. \tag{52} \]

Since, by Theorem 6, the rank of K is even and $d = n - r$, d has the same parity as n. The same statement can be made about the defects $d_3, d_5, \ldots$ of the matrices $K^3, K^5, \ldots$, because odd powers of a skew-symmetric matrix are themselves skew-symmetric. Therefore all the numbers $d_1 = d, d_3, d_5, \ldots$ have the same parity.

On the other hand, when K is raised to the m-th power, every elementary divisor $\lambda^p$ for $p < m$ splits into p elementary divisors (of the first degree) and for $p \geq m$ into m elementary divisors.¹¹ Therefore the number of elementary divisors of the matrices $K^3, K^5, \ldots$ that are powers of $\lambda$ are determined by the formulas¹²

\[
\begin{aligned}
d_3 &= \delta_1 + 2\delta_2 + 3(\delta_3 + \delta_4 + \cdots),\\
d_5 &= \delta_1 + 2\delta_2 + 3\delta_3 + 4\delta_4 + 5(\delta_5 + \delta_6 + \cdots),\\
&\ \ \vdots
\end{aligned}
\tag{53}
\]

Comparing (52) with (53), we find $d_3 - d_1 = \delta_2 + 2(\delta_3 + \delta_4 + \cdots)$, $d_5 - d_3 = \delta_4 + 2(\delta_5 + \delta_6 + \cdots)$, and so on. Bearing in mind that all the numbers $d_1 = d, d_3, d_5, \ldots$ are of the same parity, we conclude easily that $\delta_2, \delta_4, \ldots$ are even numbers.

This completes the proof of the theorem.
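The counting behind (52) and (53) amounts to the statement that a nilpotent Jordan block of order p contributes min(p, m) to the defect of the m-th power. The sketch below checks this on a hypothetical list of block sizes, using exact rank computation; the function names are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction

def nilpotent_jordan(block_sizes):
    """Quasi-diagonal matrix of nilpotent Jordan blocks H^(p), one per entry of block_sizes."""
    n = sum(block_sizes)
    J = [[0] * n for _ in range(n)]
    start = 0
    for p in block_sizes:
        for i in range(start, start + p - 1):
            J[i][i + 1] = 1
        start += p
    return J

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, rows):
            f = M[i][c] / M[r][c]
            M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

def defect_of_power(J, m):
    """Defect (order minus rank) of J^m, m >= 1."""
    P = J
    for _ in range(m - 1):
        P = matmul(P, J)
    return len(J) - rank(P)

def defect_by_formula(block_sizes, m):
    """d_m = sum over blocks of min(p, m), as in (52) and (53)."""
    return sum(min(p, m) for p in block_sizes)
```

With block sizes such as `[1, 2, 2, 3, 4, 4, 5]`, where the even sizes occur an even number of times, the defects $d_1, d_3, d_5$ all have the same parity; with `[2, 3]` they do not, in agreement with part 2 of the theorem.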


10 I.e., if $|K| = 0$. For odd n we always have $|K| = 0$.

11 See Vol. I, Chapter VI, Theorem 9, p. 158. 

12 These formulas were introduced (without reference to Theorem 9) in Vol. I, Chapter 
VI (see formulas (49) on p. 155). 




2. THEOREM 8: There exists a skew-symmetric matrix with arbitrary preassigned elementary divisors subject to the restrictions 1., 2. of the preceding theorem.


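Proof. To begin with, we shall find a skew-symmetric form for the quasi-diagonal matrix of order 2p:

\[ J_{\lambda_0}^{(pp)} = \{\lambda_0 E + H,\ -\lambda_0 E - H\} \tag{54} \]

having two elementary divisors $(\lambda-\lambda_0)^p$ and $(\lambda+\lambda_0)^p$; here $E = E^{(p)}$, $H = H^{(p)}$.

We shall look for a transforming matrix T such that

\[ T J_{\lambda_0}^{(pp)} T^{-1} \]

is skew-symmetric, i.e., such that the following equation holds:

\[ T J_{\lambda_0}^{(pp)} T^{-1} + (T^T)^{-1} [J_{\lambda_0}^{(pp)}]^T T^T = O \]

or

\[ W J_{\lambda_0}^{(pp)} + [J_{\lambda_0}^{(pp)}]^T W = O, \tag{55} \]

where W is the symmetric matrix connected with T by the equation¹³

\[ T^T T = -2iW. \tag{56} \]

We dissect W into four square blocks each of order p:

\[ W = \begin{pmatrix} W_{11} & W_{12}\\ W_{21} & W_{22} \end{pmatrix}. \]

Then (55) can be written as follows:

\[
\begin{pmatrix} W_{11} & W_{12}\\ W_{21} & W_{22} \end{pmatrix}
\begin{pmatrix} \lambda_0 E + H & O\\ O & -\lambda_0 E - H \end{pmatrix}
+ \begin{pmatrix} \lambda_0 E + H^T & O\\ O & -\lambda_0 E - H^T \end{pmatrix}
\begin{pmatrix} W_{11} & W_{12}\\ W_{21} & W_{22} \end{pmatrix} = O. \tag{57}
\]

When we perform the indicated operations on the partitioned matrices on the left-hand side of (57), we replace this equation by four matrix equations:

\[
\begin{aligned}
H^T W_{11} + W_{11}(2\lambda_0 E + H) &= O,\\
H^T W_{12} - W_{12} H &= O,\\
H^T W_{21} - W_{21} H &= O,\\
H^T W_{22} + W_{22}(2\lambda_0 E + H) &= O.
\end{aligned}
\tag{58}
\]

13 See footnote 7 on p. 9.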




The equation $AX - XB = O$, where A and B are square matrices without common characteristic values, has only the trivial solution $X = O$.¹⁴ Therefore the first and fourth of the equations (58) yield: $W_{11} = W_{22} = O$.¹⁵ As regards the second of these equations, it can be satisfied, as we have seen in the proof of Theorem 5, by setting

\[ W_{12} = V = \begin{pmatrix} 0 & \cdots & 0 & 1\\ 0 & \cdots & 1 & 0\\ \vdots & & & \vdots\\ 1 & \cdots & 0 & 0 \end{pmatrix}, \tag{59} \]

since (cf. (36))

\[ VH - H^T V = O. \]

From the symmetry of W and V it follows that

\[ W_{21} = W_{12}^T = V. \]

The third equation is then automatically satisfied.

Thus,

\[ W = \begin{pmatrix} O & V\\ V & O \end{pmatrix} = V^{(2p)}. \tag{60} \]

But then, as has become apparent on page 10, the equation (56) will be satisfied if we set

\[ T = E^{(2p)} - iV^{(2p)}. \tag{61} \]

Then

\[ T^{-1} = \tfrac12\,(E^{(2p)} + iV^{(2p)}). \tag{62} \]

Therefore, the required skew-symmetric matrix can be found by the formula¹⁶

\[
K_{\lambda_0}^{(pp)} = \tfrac12\,[E^{(2p)} - iV^{(2p)}]\,J_{\lambda_0}^{(pp)}\,[E^{(2p)} + iV^{(2p)}]
= \tfrac12\,[J_{\lambda_0}^{(pp)} - J_{\lambda_0}^{(pp)T} + i(J_{\lambda_0}^{(pp)} V^{(2p)} - V^{(2p)} J_{\lambda_0}^{(pp)})]. \tag{63}
\]

When we substitute for $J_{\lambda_0}^{(pp)}$ and $V^{(2p)}$ the corresponding partitioned matrices from (54) and (60), we find:


14 See Vol. I, Chapter VIII, § 1.
15 For $\lambda_0 \neq 0$ the equations 1. and 4. have no solutions other than zero. For $\lambda_0 = 0$ there exist other solutions, but we choose the zero solution.
16 Here we use equations (55) and (60). From these it follows that $V^{(2p)} J_{\lambda_0}^{(pp)} V^{(2p)} = -J_{\lambda_0}^{(pp)T}$.




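\[
K_{\lambda_0}^{(pp)} = \frac12\left[\begin{pmatrix} H - H^T & O\\ O & H^T - H \end{pmatrix}
+ i\left\{\begin{pmatrix} \lambda_0 E + H & O\\ O & -\lambda_0 E - H \end{pmatrix}\begin{pmatrix} O & V\\ V & O \end{pmatrix}
- \begin{pmatrix} O & V\\ V & O \end{pmatrix}\begin{pmatrix} \lambda_0 E + H & O\\ O & -\lambda_0 E - H \end{pmatrix}\right\}\right]
\]
\[
= \frac12\begin{pmatrix} H - H^T & i(2\lambda_0 V + HV + VH)\\ -i(2\lambda_0 V + HV + VH) & H^T - H \end{pmatrix}, \tag{64}
\]

i.e.,

\[
K_{\lambda_0}^{(pp)} = \frac12
\left(\begin{array}{c|c}
\begin{matrix} 0 & 1 & \cdots & 0\\ -1 & 0 & \ddots & \vdots\\ \vdots & \ddots & \ddots & 1\\ 0 & \cdots & -1 & 0 \end{matrix}
&
\begin{matrix} 0 & \cdots & 0 & i & 2i\lambda_0\\ 0 & \cdots & i & 2i\lambda_0 & i\\ \vdots & & & & \vdots\\ i & 2i\lambda_0 & i & \cdots & 0\\ 2i\lambda_0 & i & 0 & \cdots & 0 \end{matrix}
\\ \hline
\begin{matrix} 0 & \cdots & 0 & -i & -2i\lambda_0\\ 0 & \cdots & -i & -2i\lambda_0 & -i\\ \vdots & & & & \vdots\\ -i & -2i\lambda_0 & -i & \cdots & 0\\ -2i\lambda_0 & -i & 0 & \cdots & 0 \end{matrix}
&
\begin{matrix} 0 & -1 & \cdots & 0\\ 1 & 0 & \ddots & \vdots\\ \vdots & \ddots & \ddots & -1\\ 0 & \cdots & 1 & 0 \end{matrix}
\end{array}\right). \tag{65}
\]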
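The block formula (64) can be compared with the product in (63) for a concrete order. This is a minimal sketch under the conventions of the text (H with ones on the first superdiagonal, V with ones on the secondary diagonal); the function `skew_form` and the sample value of $\lambda_0$ used in the check are assumptions made for illustration.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def secondary(n):
    """Matrix V^(n): ones on the secondary diagonal."""
    return [[1 if i + j == n - 1 else 0 for j in range(n)] for i in range(n)]

def skew_form(p, lam):
    """Return (K, J, T, Tinv) with K = T J T^{-1}, T = E - iV, as in (61)-(63)."""
    n = 2 * p
    E, V = identity(n), secondary(n)
    J = [[0] * n for _ in range(n)]          # J = {lam E + H, -lam E - H}, see (54)
    for i in range(p):
        J[i][i] = lam
        J[p + i][p + i] = -lam
    for i in range(p - 1):
        J[i][i + 1] = 1
        J[p + i][p + i + 1] = -1
    T = [[E[i][j] - 1j * V[i][j] for j in range(n)] for i in range(n)]
    Tinv = [[(E[i][j] + 1j * V[i][j]) / 2 for j in range(n)] for i in range(n)]
    K = matmul(matmul(T, J), Tinv)
    return K, J, T, Tinv

def upper_right_block(p, lam):
    """Block (i/2)(2 lam V + HV + VH) of order p from (64)."""
    H = [[1 if j == i + 1 else 0 for j in range(p)] for i in range(p)]
    V = secondary(p)
    HV, VH = matmul(H, V), matmul(V, H)
    return [[0.5j * (2 * lam * V[i][j] + HV[i][j] + VH[i][j]) for j in range(p)]
            for i in range(p)]
```

For example, `skew_form(3, 2 + 1j)` returns a matrix K that is skew-symmetric and whose upper right block coincides with `upper_right_block(3, 2 + 1j)`; since K is built as $TJT^{-1}$, it has the elementary divisors of J.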


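We shall now construct a skew-symmetric matrix $K^{(q)}$ of order q having one elementary divisor $\lambda^q$, where q is odd. Obviously, the required skew-symmetric matrix will be similar to the matrix

\[
J^{(q)} = \begin{pmatrix}
0 & 1 & & & & \\
 & \ddots & \ddots & & & \\
 & & 0 & 1 & & \\
 & & & 0 & -1 & \\
 & & & & \ddots & \ddots\\
 & & & & & 0
\end{pmatrix}. \tag{66}
\]

In this matrix all the elements outside the first superdiagonal are equal to zero, and along the first superdiagonal there are at first $(q-1)/2$ elements 1 and then $(q-1)/2$ elements $-1$. Setting

\[ K^{(q)} = T J^{(q)} T^{-1}, \tag{67} \]

we find from the condition of skew-symmetry:

\[ W_1 J^{(q)} + J^{(q)T} W_1 = O, \tag{68} \]

where

\[ T^T T = -2iW_1. \tag{69} \]

By direct verification we can convince ourselves that the matrix

\[ W_1 = V^{(q)} = \begin{pmatrix} 0 & \cdots & 0 & 1\\ 0 & \cdots & 1 & 0\\ \vdots & & & \vdots\\ 1 & \cdots & 0 & 0 \end{pmatrix} \]

satisfies the condition (68). Taking this value for $W_1$, we find from (69), as before:

\[ T = E^{(q)} - iV^{(q)}, \qquad T^{-1} = \tfrac12\,(E^{(q)} + iV^{(q)}), \tag{70} \]

\[
K^{(q)} = \tfrac12\,[E^{(q)} - iV^{(q)}]\,J^{(q)}\,[E^{(q)} + iV^{(q)}]
= \tfrac12\,[J^{(q)} - J^{(q)T} + i(J^{(q)} V^{(q)} - V^{(q)} J^{(q)})]. \tag{71}
\]

When we perform the corresponding computation, we find:

\[
2K^{(q)} = \begin{pmatrix}
0 & 1 & \cdots & 0 & 0\\
-1 & 0 & \cdots & 0 & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & \cdots & 0 & -1\\
0 & 0 & \cdots & 1 & 0
\end{pmatrix}
+ i \begin{pmatrix}
0 & 0 & \cdots & 1 & 0\\
0 & 0 & \cdots & 0 & 1\\
\vdots & \vdots & & \vdots & \vdots\\
-1 & 0 & \cdots & 0 & 0\\
0 & -1 & \cdots & 0 & 0
\end{pmatrix}. \tag{72}
\]

Here only the corner elements are written out: the first matrix is $J^{(q)} - J^{(q)T}$, with the elements $1, \ldots, 1, -1, \ldots, -1$ along the first superdiagonal and their negatives along the first subdiagonal, and the second matrix is $J^{(q)} V^{(q)} - V^{(q)} J^{(q)}$, whose non-zero elements lie on the two lines adjacent to the secondary diagonal.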
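The construction (66) to (71) is easy to test for small odd q. The sketch below builds $K^{(q)}$ from the product in (71) and checks that it is skew-symmetric and nilpotent of index exactly q, so that it has the single elementary divisor $\lambda^q$. The function names are hypothetical; the value shown for q = 3 was obtained by carrying out the product by hand.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def J_odd(q):
    """Matrix (66): superdiagonal 1, ..., 1, -1, ..., -1 with (q-1)/2 entries of each sign."""
    assert q % 2 == 1
    J = [[0] * q for _ in range(q)]
    for i in range(q - 1):
        J[i][i + 1] = 1 if i < (q - 1) // 2 else -1
    return J

def K_odd(q):
    """K^(q) = (1/2)(E - iV) J^(q) (E + iV), see (70) and (71)."""
    E = [[1 if i == j else 0 for j in range(q)] for i in range(q)]
    V = [[1 if i + j == q - 1 else 0 for j in range(q)] for i in range(q)]
    T = [[E[i][j] - 1j * V[i][j] for j in range(q)] for i in range(q)]
    Tinv = [[(E[i][j] + 1j * V[i][j]) / 2 for j in range(q)] for i in range(q)]
    return matmul(matmul(T, J_odd(q)), Tinv)

def power(M, k):
    R = M
    for _ in range(k - 1):
        R = matmul(R, M)
    return R

# Hand-computed value for q = 3.
K3_EXPECTED = [[0, (1 + 1j) / 2, 0],
               [-(1 + 1j) / 2, 0, (-1 + 1j) / 2],
               [0, (1 - 1j) / 2, 0]]
```

For q = 3 the result is the matrix `K3_EXPECTED`, which agrees with (72) read for this order.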


Suppose that arbitrary elementary divisors are given, subject to the conditions of Theorem 7:

\[
\begin{gathered}
(\lambda-\lambda_j)^{p_j},\ (\lambda+\lambda_j)^{p_j} \quad (j = 1, 2, \ldots, u),\\
\lambda^{q_k} \quad (k = 1, 2, \ldots, v;\ q_1, q_2, \ldots, q_v \text{ are odd numbers}).^{17}
\end{gathered}
\tag{73}
\]

Then the quasi-diagonal skew-symmetric matrix

\[ \tilde K = \{K_{\lambda_1}^{(p_1 p_1)},\ \ldots,\ K_{\lambda_u}^{(p_u p_u)};\ K^{(q_1)},\ \ldots,\ K^{(q_v)}\} \tag{74} \]

has the elementary divisors (73).

This concludes the proof of the theorem.

COROLLARY: Every complex skew-symmetric matrix K is orthogonally similar to a skew-symmetric matrix having the normal form $\tilde K$ determined by (74), (63), and (72); i.e., there exists a (complex) orthogonal matrix Q such that

\[ K = Q \tilde K Q^{-1}. \tag{75} \]


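Note. If K is a real skew-symmetric matrix, then it has linear elementary divisors (see Vol. I, Chapter IX, § 13):

\[ \lambda - i\varphi_1,\ \lambda + i\varphi_1,\ \ldots,\ \lambda - i\varphi_u,\ \lambda + i\varphi_u,\ \underbrace{\lambda,\ \ldots,\ \lambda}_{v \text{ times}} \quad (\varphi_j \text{ are real numbers}). \]

In this case, setting all the $p_j = 1$ and all the $q_k = 1$ in (74), we obtain as the normal form of a real skew-symmetric matrix (each second-order block being the value of (64) for p = 1 and $\lambda_0 = i\varphi_j$)

\[ \left\{ \begin{pmatrix} 0 & -\varphi_1\\ \varphi_1 & 0 \end{pmatrix},\ \ldots,\ \begin{pmatrix} 0 & -\varphi_u\\ \varphi_u & 0 \end{pmatrix},\ 0,\ \ldots,\ 0 \right\}. \]
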
§ 5. The Normal Form of a Complex Orthogonal Matrix 


1. Let us begin by examining what restrictions the orthogonality of a 
matrix imposes on its elementary divisors. 


THEOREM 9: 1. If $\lambda_0$ ($\lambda_0^2 \neq 1$) is a characteristic value of an orthogonal matrix Q and if the elementary divisors

\[ (\lambda-\lambda_0)^{f_1},\ (\lambda-\lambda_0)^{f_2},\ \ldots,\ (\lambda-\lambda_0)^{f_t} \]


17 Some of the numbers $\lambda_1, \lambda_2, \ldots, \lambda_u$ may be zero. Moreover, one of the numbers u and v may be zero; i.e., in some cases there may be elementary divisors of only one type.




correspond to this characteristic value, then $1/\lambda_0$ is also a characteristic value of Q and it has the same corresponding elementary divisors:

\[ (\lambda-\lambda_0^{-1})^{f_1},\ (\lambda-\lambda_0^{-1})^{f_2},\ \ldots,\ (\lambda-\lambda_0^{-1})^{f_t}. \]

2. If $\lambda_0 = \pm 1$ is a characteristic value of the orthogonal matrix Q, then the elementary divisors of even degree corresponding to $\lambda_0$ are repeated an even number of times.

Proof. 1. For every non-singular matrix Q, on passing from Q to $Q^{-1}$ each elementary divisor $(\lambda-\lambda_0)^f$ is replaced by the elementary divisor $(\lambda-\lambda_0^{-1})^f$.¹⁸ On the other hand, the matrices Q and $Q^T$ always have the same elementary divisors. Therefore the first part of our theorem follows at once from the orthogonality condition $Q^T = Q^{-1}$.

2. Let us assume that the number 1 is a characteristic value of Q, while $-1$ is not ($|E - Q| = 0$, $|E + Q| \neq 0$). Then we apply Cayley's formulas (see Vol. I, Chapter IX, § 14), which remain valid for complex matrices. We define a matrix K by the equation

\[ K = (E - Q)(E + Q)^{-1}. \tag{76} \]

Direct verification shows that $K^T = -K$, so that K is skew-symmetric.

When we solve the equation (76) for Q, we find:¹⁹

\[ Q = (E - K)(E + K)^{-1}. \]

Setting $f(\lambda) = \dfrac{1-\lambda}{1+\lambda}$, we have $f'(\lambda) = -\dfrac{2}{(1+\lambda)^2} \neq 0$. Therefore in the transition from K to $Q = f(K)$ the elementary divisors do not split.²⁰ Hence in the system of elementary divisors of Q those of the form $(\lambda-1)^{2p}$ are repeated an even number of times, because this holds for the elementary divisors of the form $\lambda^{2p}$ of K (see Theorem 7).

The case where Q has the characteristic value $-1$, but not $+1$, is reduced to the preceding case by considering the orthogonal matrix $-Q$.

We now proceed to the most complicated case, where Q has both the characteristic values $+1$ and $-1$. We denote by $\psi(\lambda)$ the minimal polynomial of Q. Using the first part of the theorem, which has already been proved, we can write $\psi(\lambda)$ in the form


18 See Vol. I, Chapter VI, § 7. Setting $f(\lambda) = 1/\lambda$, we have $f'(\lambda) = -1/\lambda^2 \neq 0$. Hence it follows that in the transition from Q to $Q^{-1}$ the elementary divisors do not split (see Vol. I, p. 158).

19 Note that (76) implies that $E + K = 2(E + Q)^{-1}$ and therefore $|E + K| = 2^n |E + Q|^{-1} \neq 0$.

20 See Vol. I, p. 158.




\[ \psi(\lambda) = (\lambda-1)^{m_1} (\lambda+1)^{m_2} \prod_{j=1}^{u} (\lambda-\lambda_j)^{p_j} (\lambda-\lambda_j^{-1})^{p_j} \quad (\lambda_j^2 \neq 1;\ j = 1, 2, \ldots, u). \]

We consider the polynomial $g(\lambda)$ of degree less than m (m is the degree of $\psi(\lambda)$) for which $g(1) = 1$ and all the remaining $m - 1$ values on the spectrum of Q are zero; and we set:²¹

\[ P = g(Q). \tag{77} \]

Note that the functions $(g(\lambda))^2$ and $g(1/\lambda)$ assume on the spectrum of Q the same values as $g(\lambda)$. Therefore

\[ P^2 = P, \qquad P^T = g(Q^T) = g(Q^{-1}) = P, \tag{78} \]

i.e., P is a symmetric projective matrix.²²

We define a polynomial $h(\lambda)$ and a matrix N by the equations

\[ h(\lambda) = (\lambda - 1)\,g(\lambda), \tag{79} \]
\[ N = h(Q) = (Q - E)P. \tag{80} \]

Since $(h(\lambda))^{m_1}$ vanishes on the spectrum of Q, it is divisible by $\psi(\lambda)$ without remainder. Hence:

\[ N^{m_1} = O, \]

i.e., N is a nilpotent matrix with $m_1$ as index of nilpotency.

From (80) we find:²³

\[ N^T = (Q^T - E)P. \tag{81} \]
21 From the fundamental formula (see Vol. I, p. 104)

\[ g(A) = \sum_{k=1}^{s} [g(\lambda_k) Z_{k1} + g'(\lambda_k) Z_{k2} + \cdots] \]

it follows that $P = Z_{11}$.

22 A hermitian operator P is called projective if $P^2 = P$. In accordance with this, a hermitian matrix P for which $P^2 = P$ is called projective. An example of a projective operator P in a unitary space R is the operator of the orthogonal projection of a vector $x \in R$ into a subspace $S = PR$, i.e., $Px = x_S$, where $x_S \in S$ and $(x - x_S) \perp S$ (see Vol. I, p. 248).

23 All the matrices that occur here, P, N, $N^T$, $Q^T = Q^{-1}$, are permutable among each other and with Q, since they are all functions of Q.




Let us consider the matrix

\[ R = N(N^T + 2E). \tag{82} \]

From (78), (80), and (81) it follows that

\[ R = NN^T + 2N = (Q - Q^T)P. \]

From this representation of R it is clear that R is skew-symmetric.

On the other hand, from (82)

\[ R^k = N^k (N^T + 2E)^k \quad (k = 1, 2, \ldots). \tag{83} \]

But $N^T$, like N, is nilpotent, and therefore

\[ |N^T + 2E| \neq 0. \]

Hence it follows from (83) that the matrices $R^k$ and $N^k$ have the same rank for every k.

Now for odd k the matrix $R^k$ is skew-symmetric and therefore (see p. 12) has even rank. Therefore each of the matrices

\[ N,\ N^3,\ N^5,\ \ldots \]

has even rank.

By repeating verbatim for N the arguments that were used on p. 13 for K we may therefore state that among the elementary divisors of N those of the form $\lambda^{2p}$ are repeated an even number of times. But to each elementary divisor $\lambda^{2p}$ of N there corresponds an elementary divisor $(\lambda - 1)^{2p}$ of Q, and vice versa.²⁴ Hence it follows that among the elementary divisors of Q those of the form $(\lambda - 1)^{2p}$ are repeated an even number of times.

We obtain a similar statement for the elementary divisors of the form $(\lambda + 1)^{2p}$ by applying what has just been proved to the matrix $-Q$.

Thus, the proof of the theorem is complete.
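The Cayley formulas used in part 2 of the proof can be illustrated numerically. The sketch below starts from a hypothetical complex skew-symmetric matrix K (chosen so that $|E + K| \neq 0$), forms $Q = (E - K)(E + K)^{-1}$, and checks that Q is orthogonal and that (76) recovers K. It uses floating-point arithmetic with a simple Gauss-Jordan inverse, so the comparisons are only up to rounding; all function names are illustrative.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inverse(M):
    """Gauss-Jordan inverse with partial pivoting (complex floats)."""
    n = len(M)
    A = [[complex(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(M)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        d = A[c][c]
        A[c] = [x / d for x in A[c]]
        for r in range(n):
            if r != c:
                f = A[r][c]
                A[r] = [x - f * y for x, y in zip(A[r], A[c])]
    return [row[n:] for row in A]

def cayley(M):
    """(E - M)(E + M)^{-1}; the same formula maps K to Q and Q back to K."""
    n = len(M)
    minus = [[(1 if i == j else 0) - M[i][j] for j in range(n)] for i in range(n)]
    plus = [[(1 if i == j else 0) + M[i][j] for j in range(n)] for i in range(n)]
    return matmul(minus, inverse(plus))

# Hypothetical sample: a complex skew-symmetric matrix of order 3.
K = [[0, 2j, 1],
     [-2j, 0, 1 - 1j],
     [-1, -1 + 1j, 0]]
Q = cayley(K)
```

Note that orthogonality here means $Q^T Q = E$ with the ordinary transpose, not the conjugate transpose, since the matrices are complex.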


2. We shall now prove the converse theorem. 


24 Since $h(1) = 0$, $h'(1) \neq 0$, in passing from Q to $N = h(Q)$ the elementary divisors of the form $(\lambda - 1)^{2p}$ of Q do not split and are therefore replaced by elementary divisors $\lambda^{2p}$ (see Vol. I, Chapter VI, § 7).




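THEOREM 10: Every system of powers of the form

\[
\begin{gathered}
(\lambda-\lambda_j)^{p_j},\ (\lambda-\lambda_j^{-1})^{p_j} \quad (\lambda_j \neq 0;\ j = 1, 2, \ldots, u),\\
(\lambda-1)^{q_1},\ (\lambda-1)^{q_2},\ \ldots,\ (\lambda-1)^{q_v},\\
(\lambda+1)^{t_1},\ (\lambda+1)^{t_2},\ \ldots,\ (\lambda+1)^{t_w}\\
(q_1, \ldots, q_v,\ t_1, \ldots, t_w \text{ are odd numbers})
\end{gathered}
\tag{84}
\]

is the system of elementary divisors of some complex orthogonal matrix Q.²⁵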


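Proof. We denote by $\mu_j$ the numbers connected with the numbers $\lambda_j$ ($j = 1, 2, \ldots, u$) by the equations

\[ \lambda_j = e^{\mu_j} \quad (j = 1, 2, \ldots, u). \]

We now introduce the 'canonical' skew-symmetric matrices (see the preceding section)

\[ K_{\mu_j}^{(p_j p_j)} \quad (j = 1, 2, \ldots, u);\quad K^{(q_1)},\ \ldots,\ K^{(q_v)};\quad K^{(t_1)},\ \ldots,\ K^{(t_w)}, \]

with the elementary divisors

\[ (\lambda-\mu_j)^{p_j},\ (\lambda+\mu_j)^{p_j} \quad (j = 1, 2, \ldots, u);\quad \lambda^{q_1},\ \ldots,\ \lambda^{q_v};\quad \lambda^{t_1},\ \ldots,\ \lambda^{t_w}. \]

If K is a skew-symmetric matrix, then

\[ Q = e^K \]

is orthogonal ($Q^T = e^{K^T} = e^{-K} = Q^{-1}$). Moreover, to each elementary divisor $(\lambda - \mu)^p$ of K there corresponds an elementary divisor $(\lambda - e^{\mu})^p$ of Q.²⁶ Therefore the quasi-diagonal matrix

\[
\tilde Q = \left\{ e^{K_{\mu_1}^{(p_1 p_1)}},\ \ldots,\ e^{K_{\mu_u}^{(p_u p_u)}};\ e^{K^{(q_1)}},\ \ldots,\ e^{K^{(q_v)}};\ -e^{K^{(t_1)}},\ \ldots,\ -e^{K^{(t_w)}} \right\} \tag{85}
\]

is orthogonal and has the elementary divisors (84).

This proves the theorem.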
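The step "K skew-symmetric implies $e^K$ orthogonal" can be checked with a truncated power series. The following sketch assumes a fixed number of Taylor terms is adequate for the small sample matrices used; `expm_series` is a hypothetical helper, not a library routine, and `K3` is the matrix $K^{(3)}$ obtained from (71) for q = 3.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def expm_series(K, terms=40):
    """Truncated Taylor series E + K + K^2/2! + ... (assumed adequate for small K)."""
    n = len(K)
    result = [[complex(1 if i == j else 0) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, K)]
        result = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(result, term)]
    return result

# A generic complex skew-symmetric sample (hypothetical values).
K_generic = [[0, 1 + 1j, 0.5],
             [-(1 + 1j), 0, -2j],
             [-0.5, 2j, 0]]

# K^(3) from (71): nilpotent, single elementary divisor lambda^3.
K3 = [[0, (1 + 1j) / 2, 0],
      [-(1 + 1j) / 2, 0, (-1 + 1j) / 2],
      [0, (1 - 1j) / 2, 0]]

Q_generic = expm_series(K_generic)
Q3 = expm_series(K3)
```

For `K3` the series terminates, $e^{K} = E + K + K^2/2$, and $N = Q_3 - E$ satisfies $N^3 = O$, $N^2 \neq O$, which is consistent with a single elementary divisor $(\lambda - 1)^3$ of $Q_3$.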
From Theorems 4, 9, and 10 we obtain: 


25 Some (or even all) of the numbers $\lambda_j$ may be $\pm 1$. One or two of the numbers u, v, w may be zero. Then the elementary divisors of the corresponding type are absent in Q.

26 This follows from the fact that for $f(\lambda) = e^{\lambda}$ we have $f'(\lambda) = e^{\lambda} \neq 0$ for every $\lambda$.




COROLLARY: Every (complex) orthogonal matrix Q is orthogonally similar to an orthogonal matrix having the normal form $\tilde Q$; i.e., there exists an orthogonal matrix $Q_1$ such that

\[ Q = Q_1 \tilde Q Q_1^{-1}. \tag{86} \]

Note. Just as we have given a concrete form to the diagonal blocks in the skew-symmetric matrix $\tilde K$, so we could for the normal form $\tilde Q$.²⁷


27 See [378]. 


CHAPTER XII 


SINGULAR PENCILS OF MATRICES 


§ 1. Introduction 


1. The present chapter deals with the following problem: 


Given four matrices A, B, $A_1$, $B_1$ all of dimension $m \times n$ with elements from a number field F, it is required to find under what conditions there exist two square non-singular matrices P and Q of orders m and n, respectively, such that¹

\[ PAQ = A_1, \qquad PBQ = B_1. \tag{1} \]

By introduction of the pencils of matrices $A + \lambda B$ and $A_1 + \lambda B_1$, the two matrix equations (1) can be replaced by the single equation

\[ P(A + \lambda B)Q = A_1 + \lambda B_1. \tag{2} \]


DEFINITION 1: Two pencils of rectangular matrices $A + \lambda B$ and $A_1 + \lambda B_1$ of the same dimensions $m \times n$ connected by the equation (2) in which P and Q are constant square non-singular matrices (i.e., matrices independent of $\lambda$) of orders m and n, respectively, will be called strictly equivalent.²

According to the general definition of equivalence of $\lambda$-matrices (see Vol. I, Chapter VI, p. 132), the pencils $A + \lambda B$ and $A_1 + \lambda B_1$ are equivalent if an equation of the form (2) holds in which P and Q are two square $\lambda$-matrices with constant non-vanishing determinants. For strict equivalence it is required in addition that P and Q do not depend on $\lambda$.³

A criterion for equivalence of the pencils $A + \lambda B$ and $A_1 + \lambda B_1$ follows from the general criterion for equivalence of $\lambda$-matrices and consists in the equality of the invariant polynomials or, what is the same, of the elementary divisors of the pencils $A + \lambda B$ and $A_1 + \lambda B_1$ (see Vol. I, Chapter VI, p. 141).


1 If such matrices P and Q exist, then their elements can be taken from the field F. This follows from the fact that the equations (1) can be written in the form $PA = A_1 Q^{-1}$, $PB = B_1 Q^{-1}$ and are therefore equivalent to a certain system of linear homogeneous equations for the elements of P and $Q^{-1}$ with coefficients in F.

2 See Vol. I, Chapter VI, p. 145.

3 We have replaced the term 'equivalent pencils' that occurs in the literature by 'strictly equivalent pencils,' in order to draw a sharp distinction between Definition 1 and the definition of equivalence in Vol. I, Chapter VI.




In this chapter, we shall establish a criterion for strict equivalence of two pencils of matrices and we shall determine for each pencil a strictly equivalent canonical form.

2. The task we have set ourselves has a natural geometrical interpretation. We consider a pencil of linear operators $A + \lambda B$ mapping $R_n$ into $R_m$. For a definite choice of bases in these spaces the pencil of operators $A + \lambda B$ corresponds to a pencil of rectangular matrices $A + \lambda B$ (of dimension $m \times n$); under a change of bases in $R_n$ and $R_m$ the pencil $A + \lambda B$ is replaced by a strictly equivalent pencil $P(A + \lambda B)Q$, where P and Q are square non-singular matrices of order m and n (see Vol. I, Chapter III, §§ 2 and 4). Thus, a criterion for strict equivalence gives a characterization of that class of matrix pencils $A + \lambda B$ (of dimension $m \times n$) which describe one and the same pencil of operators $A + \lambda B$ mapping $R_n$ into $R_m$ for various choices of bases in these spaces.

In order to obtain a canonical form for a pencil it is necessary to find bases for $R_n$ and $R_m$ in which the pencil of operators $A + \lambda B$ is described by matrices of the simplest possible form.

Since a pencil of operators is given by two operators A and B, we can also say that: The present chapter deals with the simultaneous investigation of two operators A and B mapping $R_n$ into $R_m$.

3. All the pencils of matrices $A + \lambda B$ of dimension $m \times n$ fall into two basic types: regular and singular pencils.

DEFINITION 2: A pencil of matrices $A + \lambda B$ is called regular if
1) A and B are square matrices of the same order n; and
2) The determinant $|A + \lambda B|$ does not vanish identically.
In all other cases ($m \neq n$, or $m = n$ but $|A + \lambda B| \equiv 0$), the pencil is called singular.

A criterion for strict equivalence of regular pencils of matrices and also a canonical form for such pencils were established by Weierstrass in 1867 [377] on the basis of his theory of elementary divisors, which we have expounded in Chapters VI and VII. The analogous problems for singular pencils were solved later, in 1890, by the investigations of Kronecker [249].⁴ Kronecker's results form the primary content of this chapter.


§ 2. Regular Pencils of Matrices 


1. We consider the special case where the pencils $A + \lambda B$ and $A_1 + \lambda B_1$ consist of square matrices ($m = n$) with $|B| \neq 0$, $|B_1| \neq 0$. In this case, as we have shown in Chapter VI (Vol. I, pp. 145-146), the two concepts of 'equivalence' and 'strict equivalence' of pencils coincide. Therefore, by applying to the pencils the general criterion for equivalence of $\lambda$-matrices (Vol. I, p. 141) we are led to the following theorem:

4 Of more recent papers dealing with singular pencils of matrices we mention [234], [369], and [255].


THEOREM 1: Two pencils of square matrices of the same order $A + \lambda B$ and $A_1 + \lambda B_1$ for which $|B| \neq 0$ and $|B_1| \neq 0$ are strictly equivalent if and only if the pencils have the same elementary divisors in F.

A pencil of square matrices $A + \lambda B$ with $|B| \neq 0$ was called regular in Chapter VI, because it represents a special case of a regular matrix polynomial in $\lambda$ (see Vol. I, Chapter IV, p. 76). In the preceding section of this chapter we have given a wider definition of regularity. According to this definition it is quite possible in a regular pencil to have $|B| = 0$ (and even $|A| = |B| = 0$).

In order to find out whether Theorem 1 remains valid for regular pencils (with the extended Definition 2), we consider the following example:


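\[
A + \lambda B = \begin{pmatrix} 2 & 1 & 3\\ 3 & 2 & 5\\ 3 & 2 & 6 \end{pmatrix} + \lambda \begin{pmatrix} 1 & 1 & 2\\ 1 & 1 & 2\\ 1 & 1 & 3 \end{pmatrix},
\qquad
A_1 + \lambda B_1 = \begin{pmatrix} 2 & 2 & 1\\ 1 & 2 & 1\\ 1 & 1 & 1 \end{pmatrix} + \lambda \begin{pmatrix} 1 & 1 & 1\\ 1 & 1 & 1\\ 1 & 1 & 1 \end{pmatrix}. \tag{3}
\]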

It is easy to see that here each of the pencils $A + \lambda B$ and $A_1 + \lambda B_1$ has only one elementary divisor, $\lambda + 1$. However, the pencils are not strictly equivalent, since the matrices B and $B_1$ are of ranks 2 and 1, respectively; whereas if an equation (2) were to hold, it would follow from it that the ranks of B and $B_1$ are equal. Nevertheless, the pencils (3) are regular according to Definition 2, since

\[ |A + \lambda B| = |A_1 + \lambda B_1| = \lambda + 1. \]

This example shows that Theorem 1 is not true with the extended definition of regularity of a pencil.
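The two claims of the example, that both determinants equal $\lambda + 1$ and that B and $B_1$ have ranks 2 and 1, can be verified with exact arithmetic. The sketch below assumes the entries of (3) as they are typed in the code; since the printed example is hard to read in places, these entries should be treated as an assumption, and the helper names are hypothetical.

```python
from fractions import Fraction

def echelon(M):
    """Exact row reduction; returns (rank, determinant) for a square matrix."""
    M = [[Fraction(x) for x in row] for row in M]
    n, r, det = len(M), 0, Fraction(1)
    for c in range(n):
        piv = next((i for i in range(r, n) if M[i][c] != 0), None)
        if piv is None:
            det = Fraction(0)
            continue
        if piv != r:
            M[r], M[piv] = M[piv], M[r]
            det = -det                       # a row swap changes the sign
        det *= M[r][c]
        for i in range(r + 1, n):
            f = M[i][c] / M[r][c]
            M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r, det

def rank(M):
    return echelon(M)[0]

def det(M):
    return echelon(M)[1]

def pencil(A, B, t):
    """Value of the pencil A + t B at the number t."""
    return [[a + t * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Entries of example (3) as assumed here.
A = [[2, 1, 3], [3, 2, 5], [3, 2, 6]]
B = [[1, 1, 2], [1, 1, 2], [1, 1, 3]]
A1 = [[2, 2, 1], [1, 2, 1], [1, 1, 1]]
B1 = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
```

Evaluating the determinants at several values of t is enough here, because a polynomial of degree at most 3 that agrees with $t + 1$ at more than three points is identical with it.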


2. In order to preserve Theorem 1, we have to introduce the concept of 'infinite' elementary divisors of a pencil. We shall give the pencil $A + \lambda B$ in terms of 'homogeneous' parameters $\lambda$, $\mu$: $\mu A + \lambda B$. Then the determinant $\Delta(\lambda, \mu) = |\mu A + \lambda B|$ is a homogeneous function of $\lambda$, $\mu$. By determining the greatest common divisor $D_k(\lambda, \mu)$ of all the minors of order k of the matrix $\mu A + \lambda B$ ($k = 1, 2, \ldots, n$), we obtain the invariant polynomials by the well known formulas

\[ i_1(\lambda, \mu) = \frac{D_n(\lambda, \mu)}{D_{n-1}(\lambda, \mu)},\quad i_2(\lambda, \mu) = \frac{D_{n-1}(\lambda, \mu)}{D_{n-2}(\lambda, \mu)},\ \ldots; \]

here all the $D_k(\lambda, \mu)$ and $i_j(\lambda, \mu)$ are homogeneous polynomials in $\lambda$ and $\mu$.




Splitting the invariant polynomials into powers of homogeneous polynomials irreducible over F, we obtain the elementary divisors $e_\alpha(\lambda, \mu)$ ($\alpha = 1, 2, \ldots$) of the pencil $\mu A + \lambda B$ in F.

It is quite obvious that if we set $\mu = 1$ in $e_\alpha(\lambda, \mu)$ we are back to the elementary divisors $e_\alpha(\lambda)$ of the pencil $A + \lambda B$. Conversely, from each elementary divisor $e_\alpha(\lambda)$ of degree q we obtain the corresponding elementary divisor $e_\alpha(\lambda, \mu)$ by the formula $e_\alpha(\lambda, \mu) = \mu^q e_\alpha(\lambda/\mu)$. We can obtain in this way all the elementary divisors of the pencil $\mu A + \lambda B$ apart from those of the form $\mu^q$.

Elementary divisors of the form $\mu^q$ exist if and only if $|B| = 0$ and are called 'infinite' elementary divisors of the pencil $A + \lambda B$.

Since strict equivalence of the pencils $A + \lambda B$ and $A_1 + \lambda B_1$ implies strict equivalence of the pencils $\mu A + \lambda B$ and $\mu A_1 + \lambda B_1$, we see that for strictly equivalent pencils $A + \lambda B$ and $A_1 + \lambda B_1$ not only their 'finite,' but also their 'infinite' elementary divisors must coincide.

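Suppose now that $A + \lambda B$ and $A_1 + \lambda B_1$ are two regular pencils for which all the elementary divisors coincide (including the infinite ones). We introduce homogeneous parameters: $\mu A + \lambda B$, $\mu A_1 + \lambda B_1$. Let us now transform the parameters

\[ \lambda = \alpha_1 \tilde\lambda + \alpha_2 \tilde\mu,\quad \mu = \beta_1 \tilde\lambda + \beta_2 \tilde\mu \quad (\alpha_1 \beta_2 - \alpha_2 \beta_1 \neq 0). \]

In the new parameters the pencils are written as follows:

\[ \tilde\mu \tilde A + \tilde\lambda \tilde B,\quad \tilde\mu \tilde A_1 + \tilde\lambda \tilde B_1, \quad\text{where } \tilde B = \beta_1 A + \alpha_1 B,\ \tilde B_1 = \beta_1 A_1 + \alpha_1 B_1. \]

From the regularity of the pencils $\mu A + \lambda B$ and $\mu A_1 + \lambda B_1$ it follows that we can choose the numbers $\alpha_1$ and $\beta_1$ such that $|\tilde B| \neq 0$ and $|\tilde B_1| \neq 0$.

Therefore by Theorem 1 the pencils $\tilde\mu \tilde A + \tilde\lambda \tilde B$ and $\tilde\mu \tilde A_1 + \tilde\lambda \tilde B_1$ and consequently the original pencils $\mu A + \lambda B$ and $\mu A_1 + \lambda B_1$ (or, what is the same, $A + \lambda B$ and $A_1 + \lambda B_1$) are strictly equivalent. Thus, we have arrived at the following generalization of Theorem 1: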


THEOREM 2: Two regular pencils $A + \lambda B$ and $A_1 + \lambda B_1$ are strictly equivalent if and only if they have the same ('finite' and 'infinite') elementary divisors.

In our example above the pencils (3) had the same 'finite' elementary divisor $\lambda + 1$, but different 'infinite' elementary divisors (the first pencil has one 'infinite' elementary divisor $\mu^2$; the second has two: $\mu$, $\mu$). Therefore these pencils turn out to be not strictly equivalent.




3. Suppose now that $A + \lambda B$ is an arbitrary regular pencil. Then there exists a number c such that $|A + cB| \neq 0$. We represent the given pencil in the form $A_1 + (\lambda - c)B$, where $A_1 = A + cB$, so that $|A_1| \neq 0$. We multiply the pencil on the left by $A_1^{-1}$: $E + (\lambda - c)A_1^{-1}B$. By a similarity transformation we put the pencil in the form⁵

\[ E + (\lambda - c)\{J_0, J_1\} = \{E - cJ_0 + \lambda J_0,\ E - cJ_1 + \lambda J_1\}, \tag{4} \]

where $\{J_0, J_1\}$ is the quasi-diagonal normal form of $A_1^{-1}B$, $J_0$ is a nilpotent Jordan matrix,⁶ and $|J_1| \neq 0$.

We multiply the first diagonal block on the right-hand side of (4) by $(E - cJ_0)^{-1}$ and obtain: $E + \lambda(E - cJ_0)^{-1}J_0$. Here the coefficient of $\lambda$ is a nilpotent matrix.⁷ Therefore by a similarity transformation we can put this pencil into the form⁸

\[ E + \lambda \hat J_0 = \{N^{(u_1)},\ N^{(u_2)},\ \ldots,\ N^{(u_s)}\} \quad (N^{(u)} = E^{(u)} + \lambda H^{(u)}). \tag{5} \]

We multiply the second diagonal block on the right-hand side of (4) by $J_1^{-1}$; it can then be put into the form $J + \lambda E$ by a similarity transformation, where J is a matrix of normal form⁹ and E the unit matrix. We have thus arrived at the following theorem:

THEOREM 3: Every regular pencil $A + \lambda B$ can be reduced to a (strictly equivalent) canonical quasi-diagonal form

\[ \{N^{(u_1)},\ N^{(u_2)},\ \ldots,\ N^{(u_s)},\ J + \lambda E\} \quad (N^{(u)} = E^{(u)} + \lambda H^{(u)}), \tag{6} \]

where the first s diagonal blocks correspond to infinite elementary divisors $\mu^{u_1}, \mu^{u_2}, \ldots, \mu^{u_s}$ of the pencil $A + \lambda B$ and where the normal form of the last diagonal block $J + \lambda E$ is uniquely determined by the finite elementary divisors of the given pencil.


5 The unit matrices E in the diagonal blocks on the right-hand side of (4) have the same order as $J_0$ and $J_1$.

6 I.e., $J_0^l = O$ for some integer $l > 0$.

7 From $J_0^l = O$ it follows that $[(E - cJ_0)^{-1}J_0]^l = O$.

8 Here $E^{(u)}$ is a unit matrix of order u and $H^{(u)}$ is a matrix of order u whose elements in the first superdiagonal are 1, while the remaining elements are zero.

9 Since the matrix J can be replaced here by an arbitrary similar matrix, we may assume that J has one of the normal forms (for example, the natural form of the first or second kind or the Jordan form (see Vol. I, Chapter VI, § 6)).




§ 3. Singular Pencils. The Reduction Theorem 


1. We now proceed to consider a singular pencil of matrices $A + \lambda B$ of dimension $m \times n$. We denote by r the rank of the pencil, i.e., the largest of the orders of minors that do not vanish identically. From the singularity of the pencil it follows that at least one of the inequalities $r < n$ and $r < m$ holds, say $r < n$. Then the columns of the $\lambda$-matrix $A + \lambda B$ are linearly dependent, i.e., the equation

\[ (A + \lambda B)x = o, \tag{7} \]

where x is an unknown column matrix, has a non-zero solution. Every non-zero solution of this equation determines some dependence among the columns of $A + \lambda B$. We restrict ourselves to only such solutions $x(\lambda)$ of (7) as are polynomials in $\lambda$,¹⁰ and among these solutions we choose one of least possible degree $\varepsilon$:

\[ x(\lambda) = x_0 - \lambda x_1 + \lambda^2 x_2 - \cdots + (-1)^{\varepsilon} \lambda^{\varepsilon} x_{\varepsilon} \quad (x_{\varepsilon} \neq o). \tag{8} \]


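Substituting this solution in (7) and equating to zero the coefficients of the powers of $\lambda$, we obtain:

\[ Ax_0 = o,\quad Bx_0 - Ax_1 = o,\quad Bx_1 - Ax_2 = o,\ \ldots,\ Bx_{\varepsilon-1} - Ax_{\varepsilon} = o,\quad Bx_{\varepsilon} = o. \tag{9} \]

Considering this as a system of linear homogeneous equations for the elements of the columns $x_0, -x_1, +x_2, \ldots, (-1)^{\varepsilon} x_{\varepsilon}$, we deduce that the coefficient matrix of the system

\[
M_{\varepsilon} = M_{\varepsilon}[A + \lambda B] = \underbrace{\begin{pmatrix}
A & O & \cdots & O\\
B & A & & \vdots\\
O & B & \ddots & \\
\vdots & & \ddots & A\\
O & O & \cdots & B
\end{pmatrix}}_{\varepsilon + 1} \tag{10}
\]

is of rank $\rho_{\varepsilon} < (\varepsilon + 1)n$. At the same time, by the minimal property of $\varepsilon$, the ranks $\rho_0, \rho_1, \ldots, \rho_{\varepsilon-1}$ of the matrices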


10 For the actual determination of the elements of the column x satisfying (7) it is convenient to solve a system of linear homogeneous equations in which the coefficients of the unknown depend linearly on $\lambda$. The fundamental linearly independent solutions x can always be chosen such that their elements are polynomials in $\lambda$.




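\[
M_0 = \begin{pmatrix} A\\ B \end{pmatrix},\quad
M_1 = \begin{pmatrix} A & O\\ B & A\\ O & B \end{pmatrix},\ \ldots,\quad
M_{\varepsilon-1} = \underbrace{\begin{pmatrix}
A & O & \cdots & O\\
B & A & & \vdots\\
O & B & \ddots & \\
\vdots & & \ddots & A\\
O & O & \cdots & B
\end{pmatrix}}_{\varepsilon} \tag{10'}
\]

satisfy the equations $\rho_0 = n$, $\rho_1 = 2n$, ..., $\rho_{\varepsilon-1} = \varepsilon n$.

Thus: The number $\varepsilon$ is the least value of the index k for which the sign < holds in the relation $\rho_k \leq (k + 1)n$.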
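The rank criterion just stated translates directly into a small procedure: build $M_k$ for k = 0, 1, 2, ... and stop at the first k for which the rank is less than $(k + 1)n$. The sketch below does this with exact arithmetic. The function names and the cut-off `kmax` are assumptions made for illustration, and the test pencils are small hypothetical examples with $A + \lambda B$ equal to $(\lambda\ \ 1)$ and to a two-row analogue.

```python
from fractions import Fraction

def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, rows):
            f = M[i][c] / M[r][c]
            M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

def build_M(A, B, k):
    """Matrix M_k[A + lambda B] of (10): k+1 block columns, A on the block diagonal, B below it."""
    m, n = len(A), len(A[0])
    M = [[0] * ((k + 1) * n) for _ in range((k + 2) * m)]
    for c in range(k + 1):
        for i in range(m):
            for j in range(n):
                M[c * m + i][c * n + j] = A[i][j]
                M[(c + 1) * m + i][c * n + j] = B[i][j]
    return M

def minimal_degree(A, B, kmax=None):
    """Least k with rank M_k < (k+1) n, or None if none is found up to kmax."""
    n = len(A[0])
    if kmax is None:
        kmax = n            # assumed search bound for this sketch
    for k in range(kmax + 1):
        if rank(build_M(A, B, k)) < (k + 1) * n:
            return k
    return None
```

For the one-row pencil $(\lambda\ \ 1)$, that is A = (0 1), B = (1 0), the procedure returns 1, corresponding to the solution $x(\lambda) = (1, -\lambda)^T$; for a regular pencil it finds no such k.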


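Now we can formulate and prove the following fundamental theorem:

2. THEOREM 4: If the equation (7) has a solution of minimal degree $\varepsilon$ and $\varepsilon > 0$, then the given pencil $A + \lambda B$ is strictly equivalent to a pencil of the form

\[ \begin{pmatrix} L_{\varepsilon} & O\\ O & \hat A + \lambda \hat B \end{pmatrix}, \tag{11} \]

where

\[
L_{\varepsilon} = \left.\underbrace{\begin{pmatrix}
\lambda & 1 & 0 & \cdots & 0\\
0 & \lambda & 1 & & \vdots\\
\vdots & & \ddots & \ddots & \\
0 & 0 & \cdots & \lambda & 1
\end{pmatrix}}_{\varepsilon + 1}\right\}\varepsilon, \tag{12}
\]

and $\hat A + \lambda \hat B$ is a pencil of matrices for which the equation analogous to (7) has no solution of degree less than $\varepsilon$.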

Proof. We shall conduct the proof of the theorem in three stages. First, we shall show that the given pencil $A + \lambda B$ is strictly equivalent to a pencil of the form

\[ \begin{pmatrix} L_{\varepsilon} & D + \lambda F\\ O & \hat A + \lambda \hat B \end{pmatrix}, \tag{13} \]

where D, F, $\hat A$, $\hat B$ are constant rectangular matrices of the appropriate dimensions. Then we shall establish that the equation $(\hat A + \lambda \hat B)\hat x = o$ has no solution $\hat x(\lambda)$ of degree less than $\varepsilon$. Finally, we shall prove that by further transformations the pencil (13) can be brought into the quasi-diagonal form (11).




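1. The first part of the proof will be couched in geometrical terms. Instead of the pencil of matrices $A + \lambda B$ we consider a pencil of operators $A + \lambda B$ mapping $R_n$ into $R_m$ and show that with a suitable choice of bases in the spaces the matrix corresponding to the operator $A + \lambda B$ assumes the form (13).

Instead of (7) we take the vector equation

\[ (A + \lambda B)x = o \tag{14} \]

with the vector solution

\[ x(\lambda) = x_0 - \lambda x_1 + \lambda^2 x_2 - \cdots + (-1)^{\varepsilon} \lambda^{\varepsilon} x_{\varepsilon}; \tag{15} \]

the equations (9) are replaced by the vector equations

\[ Ax_0 = o,\quad Ax_1 = Bx_0,\quad Ax_2 = Bx_1,\ \ldots,\ Ax_{\varepsilon} = Bx_{\varepsilon-1},\quad Bx_{\varepsilon} = o. \tag{16} \]

Below we shall show that the vectors

\[ Ax_1,\ Ax_2,\ \ldots,\ Ax_{\varepsilon} \tag{17} \]

are linearly independent. Hence it will be easy to deduce the linear independence of the vectors

\[ x_0,\ x_1,\ \ldots,\ x_{\varepsilon}. \tag{18} \]

For since $Ax_0 = o$ we have from $\alpha_0 x_0 + \alpha_1 x_1 + \cdots + \alpha_{\varepsilon} x_{\varepsilon} = o$ that $\alpha_1 Ax_1 + \cdots + \alpha_{\varepsilon} Ax_{\varepsilon} = o$, so that by the linear independence of the vectors (17) $\alpha_1 = \alpha_2 = \cdots = \alpha_{\varepsilon} = 0$. But $x_0 \neq o$, since otherwise $\frac{1}{\lambda} x(\lambda)$ would be a solution of (14) of degree $\varepsilon - 1$, which is impossible. Therefore $\alpha_0 = 0$ also.

Now if we take the vectors (17) and (18) as the first $\varepsilon$ and $\varepsilon + 1$ vectors for new bases in $R_m$ and $R_n$, respectively, then in these new bases the operators A and B, by (16), will correspond to the matrices

\[
\tilde A = \begin{pmatrix}
0 & 1 & \cdots & 0 & * & \cdots & *\\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots\\
0 & 0 & \cdots & 1 & * & \cdots & *\\
0 & 0 & \cdots & 0 & * & \cdots & *\\
\vdots & \vdots & & \vdots & \vdots & & \vdots\\
0 & 0 & \cdots & 0 & * & \cdots & *
\end{pmatrix},\qquad
\tilde B = \begin{pmatrix}
1 & \cdots & 0 & 0 & * & \cdots & *\\
\vdots & \ddots & \vdots & \vdots & \vdots & & \vdots\\
0 & \cdots & 1 & 0 & * & \cdots & *\\
0 & \cdots & 0 & 0 & * & \cdots & *\\
\vdots & & \vdots & \vdots & \vdots & & \vdots\\
0 & \cdots & 0 & 0 & * & \cdots & *
\end{pmatrix}
\]

(in each matrix the first $\varepsilon + 1$ columns and the first $\varepsilon$ rows are written out explicitly, and the asterisks denote elements that are not specified);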



hence the A-matrix A +B is of the form (13). All the preceding argu- 
ments will be justified if we can show that the vectors (17) are linearly 
independent. Assume the contrary and let Ax, (h = 1) be the first vector 
in (17) that is linearly dependent on the preceding ones: 


By (16) this equation can be rewritten as follows: 


Bx,_, = «,Bx,_, + @,Bx,_, + +++ + %_,Bxp, 
1.é., 
Bxy_,=0o, 
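where
$$x^*_{h-1} = x_{h-1} - \alpha_1 x_{h-2} - \alpha_2 x_{h-3} - \cdots - \alpha_{h-1} x_0.$$
Furthermore, again by (16),
$$Ax^*_{h-1} = B(x_{h-2} - \alpha_1 x_{h-3} - \cdots - \alpha_{h-2} x_0) = Bx^*_{h-2},$$
where
$$x^*_{h-2} = x_{h-2} - \alpha_1 x_{h-3} - \cdots - \alpha_{h-2} x_0.$$
Continuing the process and introducing the vectors
$$x^*_{h-3} = x_{h-3} - \alpha_1 x_{h-4} - \cdots - \alpha_{h-3} x_0, \ \ldots, \ x^*_1 = x_1 - \alpha_1 x_0, \quad x^*_0 = x_0,$$
we obtain a chain of equations
$$Bx^*_{h-1} = o, \quad Ax^*_{h-1} = Bx^*_{h-2}, \ \ldots, \ Ax^*_1 = Bx^*_0, \quad Ax^*_0 = o. \tag{19}$$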


From (19) it follows that
$$x^*(\lambda) = x^*_0 - \lambda x^*_1 + \cdots + (-1)^{h-1} \lambda^{h-1} x^*_{h-1} \qquad (x^*_0 = x_0 \neq o)$$
is a non-zero solution of (14) of degree $\leq h - 1 < \varepsilon$, which is impossible.
Thus, the vectors (17) are linearly independent.


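2. We shall now show that the equation $(\hat A + \lambda \hat B)\hat x = o$ has no solution
of degree less than $\varepsilon$. To begin with, we observe that the equation $L_\varepsilon y = o$,
like (7), has a non-zero solution of least degree $\varepsilon$. We can see this immediately
if we replace the matrix equation $L_\varepsilon y = o$ by the system of ordinary
equations
$$\lambda y_1 + y_2 = 0, \quad \lambda y_2 + y_3 = 0, \ \ldots, \ \lambda y_\varepsilon + y_{\varepsilon+1} = 0 \qquad (y = (y_1, y_2, \ldots, y_{\varepsilon+1}));$$
whence
$$y_k = (-1)^{k-1} y_1 \lambda^{k-1} \qquad (k = 1, 2, \ldots, \varepsilon + 1).$$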
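The formula just obtained can be checked mechanically. The following Python sketch is an illustration added here, not part of the original text; the function names are hypothetical. It assumes the shape of $L_\varepsilon$ used in this section ($\varepsilon$ rows, $\varepsilon + 1$ columns, $\lambda$ on the diagonal and 1 immediately to its right) and evaluates $y_k = (-1)^{k-1}\lambda^{k-1}y_1$ at a few rational values of $\lambda$.

```python
from fractions import Fraction

def L_times(eps, lam, y):
    """Apply L_eps to the column y: row k is lam*y_k + y_{k+1} (k = 1..eps)."""
    return [lam * y[k] + y[k + 1] for k in range(eps)]

def least_degree_solution(eps, lam, y1=1):
    """Evaluate y_k = (-1)^(k-1) * y1 * lam^(k-1), k = 1..eps+1, at a numeric lam.
    The list index k below runs from 0, so it plays the role of k-1."""
    return [(-1) ** k * y1 * lam ** k for k in range(eps + 1)]

eps = 3
for lam in (Fraction(-2), Fraction(1, 3), Fraction(5)):
    y = least_degree_solution(eps, lam)
    # Every row of L_eps * y must vanish.
    assert L_times(eps, lam, y) == [0] * eps
```

The highest power of $\lambda$ occurring in the column is $\lambda^{\varepsilon}$, in agreement with the statement that the least degree of a non-zero solution is $\varepsilon$.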




On the other hand, if the pencil has the 'triangular' form (13), then the
corresponding matrix pencil $M_k$ ($k = 0, 1, \ldots, \varepsilon$) (see (10) and (10') on
pp. 29 and 30) can also be brought into triangular form, after a suitable
permutation of rows and columns:
$$\begin{pmatrix} M_k[L_\varepsilon] & M_k[D + \lambda F] \\ O & M_k[\hat A + \lambda \hat B] \end{pmatrix}. \tag{20}$$
For $k = \varepsilon - 1$ all the columns of this matrix, like those of $M_{\varepsilon-1}[L_\varepsilon]$, are
linearly independent.¹¹ But $M_{\varepsilon-1}[L_\varepsilon]$ is a square matrix of order $\varepsilon(\varepsilon + 1)$.
Therefore in $M_{\varepsilon-1}[\hat A + \lambda \hat B]$ also, all the columns are linearly independent
and, as we have explained at the beginning of the section, this means that the
equation $(\hat A + \lambda \hat B)\hat x = o$ has no solution of degree less than or equal to $\varepsilon - 1$,
which is what we had to prove.


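3. Let us replace the pencil (13) by the strictly equivalent pencil
$$\begin{pmatrix} E_1 & Y \\ O & E_2 \end{pmatrix}
\begin{pmatrix} L_\varepsilon & D + \lambda F \\ O & \hat A + \lambda \hat B \end{pmatrix}
\begin{pmatrix} E_3 & -X \\ O & E_4 \end{pmatrix}
= \begin{pmatrix} L_\varepsilon & D + \lambda F + Y(\hat A + \lambda \hat B) - L_\varepsilon X \\ O & \hat A + \lambda \hat B \end{pmatrix}, \tag{21}$$
where $E_1$, $E_2$, $E_3$, and $E_4$ are square unit matrices of orders $\varepsilon$, $m - \varepsilon$, $\varepsilon + 1$,
and $n - \varepsilon - 1$, respectively, and $X$, $Y$ are arbitrary constant rectangular
matrices of the appropriate dimensions. Our theorem will be completely
proved if we can show that the matrices $X$ and $Y$ can be chosen such that the
matrix equation
$$L_\varepsilon X = D + \lambda F + Y(\hat A + \lambda \hat B) \tag{22}$$
holds.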
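We introduce a notation for the elements of $D$, $F$, $X$ and also for the
rows of $Y$ and the columns of $\hat A$ and $\hat B$:
$$D = \|d_{ik}\|, \quad F = \|f_{ik}\|, \quad X = \|x_{jk}\|$$
$$(i = 1, 2, \ldots, \varepsilon; \quad k = 1, 2, \ldots, n - \varepsilon - 1; \quad j = 1, 2, \ldots, \varepsilon + 1),$$
$$Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_\varepsilon \end{pmatrix}, \quad
\hat A = (a_1, a_2, \ldots, a_{n-\varepsilon-1}), \quad \hat B = (b_1, b_2, \ldots, b_{n-\varepsilon-1}).$$
Then the matrix equation (22) can be replaced by a system of scalar equations
that expresses the equality of the elements of the $k$-th column on the
right-hand and left-hand sides of (22) ($k = 1, 2, \ldots, n - \varepsilon - 1$):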


11 This follows from the fact that the rank of the matrix (20) for $k = \varepsilon - 1$ is equal
to $\varepsilon n$; a similar equation holds for the rank of the matrix $M_{\varepsilon-1}[L_\varepsilon]$.




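$$\begin{aligned}
x_{2k} + \lambda x_{1k} &= d_{1k} + \lambda f_{1k} + y_1 a_k + \lambda y_1 b_k, \\
x_{3k} + \lambda x_{2k} &= d_{2k} + \lambda f_{2k} + y_2 a_k + \lambda y_2 b_k, \\
x_{4k} + \lambda x_{3k} &= d_{3k} + \lambda f_{3k} + y_3 a_k + \lambda y_3 b_k, \\
&\ \ \vdots \\
x_{\varepsilon+1,k} + \lambda x_{\varepsilon k} &= d_{\varepsilon k} + \lambda f_{\varepsilon k} + y_\varepsilon a_k + \lambda y_\varepsilon b_k
\end{aligned} \tag{23}$$
$$(k = 1, 2, \ldots, n - \varepsilon - 1).$$

The left-hand sides of these equations are linear binomials in $\lambda$. The
free term of each of the first $\varepsilon - 1$ of these binomials is equal to the coefficient
of $\lambda$ in the next binomial. But then the right-hand sides must also
satisfy this condition. Therefore
$$\begin{aligned}
y_1 a_k - y_2 b_k &= f_{2k} - d_{1k}, \\
y_2 a_k - y_3 b_k &= f_{3k} - d_{2k}, \\
&\ \ \vdots \\
y_{\varepsilon-1} a_k - y_\varepsilon b_k &= f_{\varepsilon k} - d_{\varepsilon-1,k}
\end{aligned} \tag{24}$$
$$(k = 1, 2, \ldots, n - \varepsilon - 1).$$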


If (24) holds, then the required elements of X can obviously be determined 
from (23). 

It now remains to show that the system of equations (24) for the elements
of $Y$ always has a solution for arbitrary $d_{ik}$ and $f_{ik}$ ($i = 1, 2, \ldots, \varepsilon$;
$k = 1, 2, \ldots, n - \varepsilon - 1$). Indeed, the matrix formed from the coefficients
of the unknown elements of the rows $y_1, -y_2, y_3, -y_4, \ldots$ can be written,
after transposition, in the form
$$\begin{pmatrix}
\hat A & O & \cdots & O \\
\hat B & \hat A & & \vdots \\
O & \hat B & \ddots & O \\
\vdots & & \ddots & \hat A \\
O & O & \cdots & \hat B
\end{pmatrix}$$
with $\varepsilon - 1$ block columns.

But this is the matrix $M_{\varepsilon-2}$ for the pencil of rectangular matrices $\hat A + \lambda \hat B$
(see (10') on p. 30). The rank of the matrix is $(\varepsilon - 1)(n - \varepsilon - 1)$, because
the equation $(\hat A + \lambda \hat B)\hat x = o$, by what we have shown, has no solutions
of degree less than $\varepsilon$. Thus, the rank of the system of equations (24) is
equal to the number of equations and such a system is consistent
(non-contradictory) for arbitrary free terms.


This completes the proof of the theorem. 




§ 4. The Canonical Form of a Singular Pencil of Matrices 


1. Let $A + \lambda B$ be an arbitrary singular pencil of matrices of dimension
$m \times n$. To begin with, we shall assume that neither among the columns
nor among the rows of the pencil is there a linear dependence with constant
coefficients.

Let $r < n$, where $r$ is the rank of the pencil, so that the columns of $A + \lambda B$
are linearly dependent. In this case the equation $(A + \lambda B)x = o$ has a non-zero
solution of minimal degree $\varepsilon_1$. From the restriction made at the beginning
of this section it follows that $\varepsilon_1 > 0$. Therefore by Theorem 4 the
given pencil can be transformed into the form
$$\begin{pmatrix} L_{\varepsilon_1} & O \\ O & A_1 + \lambda B_1 \end{pmatrix},$$
where the equation $(A_1 + \lambda B_1)x^{(1)} = o$ has no solution $x^{(1)}$ of degree less
than $\varepsilon_1$.
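If this equation has a non-zero solution of minimal degree $\varepsilon_2$ (where,
necessarily, $\varepsilon_2 \geq \varepsilon_1$), then by applying Theorem 4 to the pencil $A_1 + \lambda B_1$
we can transform the given pencil into the form
$$\begin{pmatrix} L_{\varepsilon_1} & O & O \\ O & L_{\varepsilon_2} & O \\ O & O & A_2 + \lambda B_2 \end{pmatrix}.$$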


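Continuing this process, we can put the given pencil into the quasi-diagonal
form
$$\begin{pmatrix}
L_{\varepsilon_1} & & & & O \\
 & L_{\varepsilon_2} & & & \\
 & & \ddots & & \\
 & & & L_{\varepsilon_p} & \\
O & & & & A_p + \lambda B_p
\end{pmatrix}, \tag{25}$$
where $0 < \varepsilon_1 \leq \varepsilon_2 \leq \cdots \leq \varepsilon_p$ and the equation $(A_p + \lambda B_p)x^{(p)} = o$ has no
non-zero solution, so that the columns of $A_p + \lambda B_p$ are linearly independent.¹²

If the rows of $A_p + \lambda B_p$ are linearly dependent, then the transposed
pencil $A_p^T + \lambda B_p^T$ can be put into the form (25), where instead of $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_p$
there occur the numbers $(0 <)\ \eta_1 \leq \eta_2 \leq \cdots \leq \eta_q$.¹³ But then the given
pencil $A + \lambda B$ turns out to be transformable into the quasi-diagonal form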


12 In the special case where $\varepsilon_1 + \varepsilon_2 + \cdots + \varepsilon_p = m$ the block $A_p + \lambda B_p$ is absent.


13 Since no linear dependence with constant coefficients exists among the rows of the
pencil $A + \lambda B$ and consequently of $A_p + \lambda B_p$, we have $\eta_1 > 0$.




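$$\begin{pmatrix}
L_{\varepsilon_1} & & & & & & O \\
 & \ddots & & & & & \\
 & & L_{\varepsilon_p} & & & & \\
 & & & L^T_{\eta_1} & & & \\
 & & & & \ddots & & \\
 & & & & & L^T_{\eta_q} & \\
O & & & & & & A_0 + \lambda B_0
\end{pmatrix} \tag{26}$$
$$(0 < \varepsilon_1 \leq \varepsilon_2 \leq \cdots \leq \varepsilon_p, \quad 0 < \eta_1 \leq \eta_2 \leq \cdots \leq \eta_q),$$
where both the columns and the rows of $A_0 + \lambda B_0$ are linearly independent,
i.e., $A_0 + \lambda B_0$ is a regular pencil.¹⁴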


2. We now consider the general case where the rows and the columns of
the given pencil may be connected by linear relations with constant coefficients.
We denote the maximal number of constant independent solutions
of the equations
$$(A + \lambda B)x = o \quad \text{and} \quad (A^T + \lambda B^T)y = o$$
by $g$ and $h$, respectively. Instead of the first of these equations we consider,
just as in the proof of Theorem 4, the corresponding vector equation
$(A + \lambda B)x = o$ ($A$ and $B$ are operators mapping $R_n$ into $R_m$). We denote
linearly independent constant solutions of this equation by $e_1, e_2, \ldots, e_g$
and take them as the first $g$ basis vectors in $R_n$. Then the first $g$ columns
of the corresponding matrix $\tilde A + \lambda \tilde B$ consist of zeros:
$$\tilde A + \lambda \tilde B = (O, \ \tilde A_1 + \lambda \tilde B_1). \tag{27}$$
Similarly, the first $h$ rows of the pencil $\tilde A_1 + \lambda \tilde B_1$ can be made into zeros.
The given pencil then assumes the form
$$\begin{pmatrix} O & O \\ O & A^0 + \lambda B^0 \end{pmatrix}, \tag{28}$$
in which the zero block in the upper left-hand corner has $h$ rows and $g$ columns, and


14 If in the given pencil $r = n$, i.e., if the columns of the pencil are linearly independent,
then the first $p$ diagonal blocks in (26) of the form $L_\varepsilon$ are absent ($p = 0$). In the same
way, if $r = m$, i.e., if the rows of $A + \lambda B$ are linearly independent, then in (26) the
diagonal blocks of the form $L^T_\eta$ are absent ($q = 0$).




where there is no longer any linear dependence with constant coefficients
among the rows or the columns of the pencil $A^0 + \lambda B^0$. The pencil $A^0 + \lambda B^0$
can now be represented in the form (26). Thus, in the general case, the
pencil $A + \lambda B$ can always be put into the canonical quasi-diagonal form
$$\{O_{h \times g}, \ L_{\varepsilon_{g+1}}, \ldots, L_{\varepsilon_p}, \ L^T_{\eta_{h+1}}, \ldots, L^T_{\eta_q}, \ A_0 + \lambda B_0\}, \tag{29}$$
where $O_{h \times g}$ denotes the zero block with $h$ rows and $g$ columns.

The choice of indices for $\varepsilon$ and $\eta$ is due to the fact that it is convenient here
to take $\varepsilon_1 = \varepsilon_2 = \cdots = \varepsilon_g = 0$ and $\eta_1 = \eta_2 = \cdots = \eta_h = 0$.

When we replace the regular pencil $A_0 + \lambda B_0$ in (29) by its canonical
form (6) (see § 2, p. 28), we finally obtain the following quasi-diagonal
matrix:
$$\{O_{h \times g}; \ L_{\varepsilon_{g+1}}, \ldots, L_{\varepsilon_p}; \ L^T_{\eta_{h+1}}, \ldots, L^T_{\eta_q}; \ N^{(u_1)}, \ldots, N^{(u_s)}; \ J + \lambda E\}, \tag{30}$$
where the matrix $J$ is of Jordan normal form or of natural normal form and
$N^{(u)} = E^{(u)} + \lambda H^{(u)}$.

The matrix (30) is the canonical form of the pencil $A + \lambda B$ in the most
general case.

In order to determine the canonical form (30) of a given pencil immediately,
without carrying out the successive reduction processes, we shall,
following Kronecker, introduce in the next section the concept of minimal
indices of a pencil.


§ 5. The Minimal Indices of a Pencil. Criterion for
Strong Equivalence of Pencils


1. Let $A + \lambda B$ be an arbitrary singular pencil of rectangular matrices.
Then the $k$ polynomial columns $x_1(\lambda), x_2(\lambda), \ldots, x_k(\lambda)$ that are solutions
of the equation
$$(A + \lambda B)x = o \tag{31}$$
are linearly dependent if the rank of the polynomial matrix formed from
these columns $X = [x_1(\lambda), x_2(\lambda), \ldots, x_k(\lambda)]$ is less than $k$. In that case
there exist $k$ polynomials $p_1(\lambda), p_2(\lambda), \ldots, p_k(\lambda)$, not all identically zero,
such that
$$p_1(\lambda)x_1(\lambda) + p_2(\lambda)x_2(\lambda) + \cdots + p_k(\lambda)x_k(\lambda) \equiv o.$$
But if the rank of $X$ is $k$, then such a dependence does not exist and the
solutions $x_1(\lambda), x_2(\lambda), \ldots, x_k(\lambda)$ are linearly independent.




Among all the solutions of (31) we choose a non-zero solution $x_1(\lambda)$ of
least degree $\varepsilon_1$. Among all the solutions of the same equation that are linearly
independent of $x_1(\lambda)$ we take a solution $x_2(\lambda)$ of least degree $\varepsilon_2$.
Obviously, $\varepsilon_1 \leq \varepsilon_2$. We continue the process, choosing among the solutions
that are linearly independent of $x_1(\lambda)$ and $x_2(\lambda)$ a solution $x_3(\lambda)$ of minimal
degree $\varepsilon_3$, etc. Since the number of linearly independent solutions of (31)
is always at most $n$, the process must come to an end. We obtain a fundamental
series of solutions of (31)
$$x_1(\lambda), x_2(\lambda), \ldots, x_p(\lambda) \tag{32}$$
having the degrees
$$\varepsilon_1 \leq \varepsilon_2 \leq \cdots \leq \varepsilon_p. \tag{33}$$


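In general, a fundamental series of solutions is not uniquely determined
(to within scalar factors) by the pencil $A + \lambda B$. However, two distinct
fundamental series of solutions always have one and the same series of
degrees $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_p$. For let us consider in addition to (32) another fundamental
series of solutions $\tilde x_1(\lambda), \tilde x_2(\lambda), \ldots$ with the degrees $\tilde\varepsilon_1, \tilde\varepsilon_2, \ldots$.
Suppose that in (33)
$$\varepsilon_1 = \cdots = \varepsilon_{n_1} < \varepsilon_{n_1+1} = \cdots = \varepsilon_{n_2} < \cdots$$
and similarly, in the series $\tilde\varepsilon_1, \tilde\varepsilon_2, \ldots$,
$$\tilde\varepsilon_1 = \cdots = \tilde\varepsilon_{\tilde n_1} < \tilde\varepsilon_{\tilde n_1+1} = \cdots = \tilde\varepsilon_{\tilde n_2} < \cdots.$$
Obviously, $\varepsilon_1 = \tilde\varepsilon_1$. Every column $\tilde x_i(\lambda)$ ($i = 1, 2, \ldots, \tilde n_1$) is a linear combination
of the columns $x_1(\lambda), x_2(\lambda), \ldots, x_{n_1}(\lambda)$, since otherwise the solution
$x_{n_1+1}(\lambda)$ in (32) could be replaced by $\tilde x_i(\lambda)$, which is of smaller degree.
It is obvious that, conversely, every column $x_i(\lambda)$ ($i = 1, 2, \ldots, n_1$) is a
linear combination of the columns $\tilde x_1(\lambda), \tilde x_2(\lambda), \ldots, \tilde x_{\tilde n_1}(\lambda)$. Therefore
$n_1 = \tilde n_1$ and $\varepsilon_{n_1+1} = \tilde\varepsilon_{\tilde n_1+1}$. Now by a similar argument we obtain that
$n_2 = \tilde n_2$ and $\varepsilon_{n_2+1} = \tilde\varepsilon_{\tilde n_2+1}$, etc.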


2. Every solution $x_k(\lambda)$ of the fundamental series (32) yields a linear
dependence of degree $\varepsilon_k$ among the columns of $A + \lambda B$ ($k = 1, 2, \ldots, p$).
Therefore the numbers $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_p$ are called the minimal indices for the
columns of the pencil $A + \lambda B$.

The minimal indices $\eta_1, \eta_2, \ldots, \eta_q$ for the rows of the pencil $A + \lambda B$ are
introduced similarly. Here the equation $(A + \lambda B)x = o$ is replaced by
$(A^T + \lambda B^T)y = o$, and $\eta_1, \eta_2, \ldots, \eta_q$ are defined as minimal indices for the
columns of the transposed pencil $A^T + \lambda B^T$.




Strictly equivalent pencils have the same minimal indices. For let
$A + \lambda B$ and $P(A + \lambda B)Q$ be two such pencils ($P$ and $Q$ are non-singular
square matrices). Then the equation (31) for the first pencil can be written,
after multiplication on the left by $P$, as follows:
$$P(A + \lambda B)Q \cdot Q^{-1}x = o.$$
Hence it is clear that all the solutions of (31), after multiplication on the
left by $Q^{-1}$, give rise to a complete system of solutions of the equation
$$P(A + \lambda B)Qz = o.$$
Therefore the pencils $A + \lambda B$ and $P(A + \lambda B)Q$ have the same minimal
indices for the columns. That the minimal indices for the rows also coincide
can be established by going over to the transposed pencils.

Let us compute the minimal indices for the canonical quasi-diagonal
matrix
$$\{O_{h \times g}; \ L_{\varepsilon_{g+1}}, \ldots, L_{\varepsilon_p}; \ L^T_{\eta_{h+1}}, \ldots, L^T_{\eta_q}; \ A_0 + \lambda B_0\} \tag{34}$$
($A_0 + \lambda B_0$ is a regular pencil having the normal form (6)).

We note first of all that: The complete system of indices for the columns
(rows) of a quasi-diagonal matrix is obtained as the union of the corresponding
systems of minimal indices of the individual diagonal blocks. The matrix
$L_\varepsilon$ has only one index $\varepsilon$ for the columns, and its rows are linearly independent.
Similarly, the matrix $L^T_\eta$ has only one index $\eta$ for the rows, and its
columns are linearly independent. Therefore the matrix (34) has as its
minimal indices for the columns
$$\varepsilon_1 = \cdots = \varepsilon_g = 0, \quad \varepsilon_{g+1}, \ldots, \varepsilon_p$$
and for the rows
$$\eta_1 = \cdots = \eta_h = 0, \quad \eta_{h+1}, \ldots, \eta_q.$$

We note further that $L_\varepsilon$ has no elementary divisors, since among its
minors of maximal order $\varepsilon$ there is one equal to 1 and one equal to $\lambda^\varepsilon$. The
same statement is, of course, true for the transposed matrix $L^T_\varepsilon$. Since the
elementary divisors of a quasi-diagonal matrix are obtained by combining
those of the individual diagonal blocks (see Volume I, Chapter VI, p. 141),
the elementary divisors of the $\lambda$-matrix (34) coincide with those of its regular
'kernel' $A_0 + \lambda B_0$.

The canonical form of the pencil (34) is completely determined by the
minimal indices $\varepsilon_1, \ldots, \varepsilon_p$, $\eta_1, \ldots, \eta_q$ and the elementary divisors of the
pencil or, what is the same, of the strictly equivalent pencil $A + \lambda B$. Since
two pencils having one and the same canonical form are strictly equivalent,
we have proved the following theorem:


THEOREM 5 (Kronecker): Two arbitrary pencils $A + \lambda B$ and $A_1 + \lambda B_1$
of rectangular matrices of the same dimension $m \times n$ are strictly equivalent
if and only if they have the same minimal indices and the same (finite and
infinite) elementary divisors.


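In conclusion, we write down, for purposes of illustration, the canonical
form of a pencil $A + \lambda B$ with the minimal indices $\varepsilon_1 = 0$, $\varepsilon_2 = 1$, $\varepsilon_3 = 2$,
$\eta_1 = 0$, $\eta_2 = 0$, $\eta_3 = 2$ and the elementary divisors $\lambda^2$, $(\lambda + 2)^2$, $\mu^3$:¹⁵
$$\left\{
\begin{pmatrix} 0 \\ 0 \end{pmatrix}; \
\begin{pmatrix} \lambda & 1 \end{pmatrix}; \
\begin{pmatrix} \lambda & 1 & 0 \\ 0 & \lambda & 1 \end{pmatrix}; \
\begin{pmatrix} \lambda & 0 \\ 1 & \lambda \\ 0 & 1 \end{pmatrix}; \
\begin{pmatrix} 1 & \lambda & 0 \\ 0 & 1 & \lambda \\ 0 & 0 & 1 \end{pmatrix}; \
\begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}; \
\begin{pmatrix} \lambda + 2 & 1 \\ 0 & \lambda + 2 \end{pmatrix}
\right\}. \tag{35}$$
The diagonal blocks listed are, in order, the $2 \times 1$ zero block, $L_1$, $L_2$, $L_2^T$, $N^{(3)}$, and the two Jordan blocks of $J + \lambda E$.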
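The assembly of a canonical form from prescribed invariants can be sketched in a few lines of Python. This is an illustration added here, not part of the original text: the function names and the data layout (each block stored as its size together with the sparse entries of its constant part and of its coefficient of $\lambda$) are assumptions made for the example. The block shapes follow (30): $L_\varepsilon$ has $\lambda$ on the diagonal and 1 above it, $N^{(u)} = E + \lambda H$, and the block for $(\lambda + \lambda_0)^c$ is $(\lambda + \lambda_0)E + H$.

```python
def L(eps):
    # L_eps: eps rows, eps+1 columns.
    return (eps, eps + 1,
            {(i, i + 1): 1 for i in range(eps)},   # constant part
            {(i, i): 1 for i in range(eps)})       # coefficient of lam

def T(block):
    # Transpose of a block.
    r, c, A, B = block
    return (c, r, {(j, i): v for (i, j), v in A.items()},
                  {(j, i): v for (i, j), v in B.items()})

def N(u):
    # N^(u) = E + lam*H, with H carrying ones on the first superdiagonal.
    return (u, u, {(i, i): 1 for i in range(u)},
                  {(i, i + 1): 1 for i in range(u - 1)})

def J(c, lam0):
    # Block for the elementary divisor (lam + lam0)^c.
    A = {(i, i): lam0 for i in range(c)}
    A.update({(i, i + 1): 1 for i in range(c - 1)})
    return (c, c, A, {(i, i): 1 for i in range(c)})

def quasi_diagonal(blocks):
    rows = sum(b[0] for b in blocks)
    cols = sum(b[1] for b in blocks)
    A = [[0] * cols for _ in range(rows)]
    B = [[0] * cols for _ in range(rows)]
    r0 = c0 = 0
    for r, c, a, b in blocks:
        for (i, j), v in a.items():
            A[r0 + i][c0 + j] = v
        for (i, j), v in b.items():
            B[r0 + i][c0 + j] = v
        r0, c0 = r0 + r, c0 + c
    return A, B

def canonical_pencil(col_indices, row_indices, infinite, finite):
    """Return (A, B) with A + lam*B in the quasi-diagonal form (30).
    infinite lists the exponents u of mu^u; finite lists pairs (lam0, c)."""
    g = sum(1 for e in col_indices if e == 0)
    h = sum(1 for e in row_indices if e == 0)
    blocks = [(h, g, {}, {})]                      # zero block, h rows, g columns
    blocks += [L(e) for e in col_indices if e > 0]
    blocks += [T(L(e)) for e in row_indices if e > 0]
    blocks += [N(u) for u in infinite]
    blocks += [J(c, lam0) for lam0, c in finite]
    return quasi_diagonal(blocks)

# Data of the illustration: indices 0, 1, 2 and 0, 0, 2; divisors mu^3, lam^2, (lam + 2)^2.
A, B = canonical_pencil([0, 1, 2], [0, 0, 2], [3], [(0, 2), (2, 2)])
```

For these data the pencil comes out square of order 15, as the block sizes $2 \times 1$, $1 \times 2$, $2 \times 3$, $3 \times 2$, $3 \times 3$, $2 \times 2$, $2 \times 2$ require.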
§ 6. Singular Pencils of Quadratic Forms 
1. Suppose given two complex quadratic forms:
$$A(x, x) = \sum_{i,k=1}^{n} a_{ik} x_i x_k, \quad B(x, x) = \sum_{i,k=1}^{n} b_{ik} x_i x_k; \tag{36}$$
they generate a pencil of quadratic forms $A(x, x) + \lambda B(x, x)$. This pencil
of forms corresponds to a pencil of symmetric matrices $A + \lambda B$ ($A^T = A$,
$B^T = B$). If we subject the variables in the pencil of forms $A(x, x) + \lambda B(x, x)$
to a non-singular linear transformation $x = Tz$ ($|T| \neq 0$), then the transformed
pencil of forms $\tilde A(z, z) + \lambda \tilde B(z, z)$ corresponds to the pencil of
matrices

15 All the elements of the matrix that are not mentioned expressly are zero. 




$$\tilde A + \lambda \tilde B = T^T(A + \lambda B)T; \tag{37}$$
here $T$ is a constant (i.e., independent of $\lambda$) non-singular square matrix of
order $n$.

Two pencils of matrices $A + \lambda B$ and $\tilde A + \lambda \tilde B$ that are connected by a relation
(37) are called congruent (see Definition 1 of Chapter X; Vol. I, p. 296).
Obviously, congruence is a special case of equivalence of pencils of
matrices. However, if congruence of two pencils of symmetric (or skew-symmetric)
matrices is under consideration, then the concept of congruence
coincides with that of equivalence. This is the content of the following
theorem.


THEOREM 6: Two strictly equivalent pencils of complex symmetric (or
skew-symmetric) matrices are always congruent.

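Proof. Let $\Lambda = A + \lambda B$ and $\tilde\Lambda = \tilde A + \lambda \tilde B$ be two strictly equivalent
pencils of symmetric (skew-symmetric) matrices:
$$\tilde\Lambda = P \Lambda Q \qquad (\Lambda^T = \pm\Lambda, \ \tilde\Lambda^T = \pm\tilde\Lambda; \ |P| \neq 0, \ |Q| \neq 0). \tag{38}$$
By going over to the transposed matrices we obtain:
$$\tilde\Lambda = Q^T \Lambda P^T. \tag{39}$$
From (38) and (39), we have
$$\Lambda Q (P^T)^{-1} = P^{-1} Q^T \Lambda. \tag{40}$$
Setting
$$U = Q (P^T)^{-1}, \tag{41}$$
we rewrite (40) as follows:
$$\Lambda U = U^T \Lambda. \tag{42}$$
From (42) it follows easily that
$$\Lambda U^k = (U^T)^k \Lambda \qquad (k = 0, 1, 2, \ldots)$$
and, in general,
$$\Lambda S = S^T \Lambda, \tag{43}$$
where
$$S = f(U), \tag{44}$$
and $f(\lambda)$ is an arbitrary polynomial in $\lambda$. Let us assume that this polynomial
is chosen such that $|S| \neq 0$. Then we have from (43):
$$\Lambda = S^T \Lambda S^{-1}. \tag{45}$$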




Substituting this expression for $\Lambda$ in (38), we have:
$$\tilde\Lambda = P S^T \Lambda S^{-1} Q. \tag{46}$$
If this relation is to be a congruence transformation, the following equation
must be satisfied:
$$(P S^T)^T = S^{-1} Q,$$
which can be rewritten as
$$S^2 = Q (P^T)^{-1} = U.$$
Now the matrix $S = f(U)$ satisfies this equation if we take as $f(\lambda)$ the
interpolation polynomial for $\sqrt{\lambda}$ on the spectrum of $U$. This can be done,
because the many-valued function $\sqrt{\lambda}$ has a single-valued branch determined
on the spectrum of $U$, since $|U| \neq 0$.

The equation (46) now becomes the condition for congruence
$$\tilde\Lambda = T^T \Lambda T \qquad \left(T = S^{-1} Q, \ S = \sqrt{Q (P^T)^{-1}}\right). \tag{47}$$


From this theorem and Theorem 5 we deduce: 


COROLLARY: Two pencils of quadratic forms
$$A(x, x) + \lambda B(x, x) \quad \text{and} \quad \tilde A(z, z) + \lambda \tilde B(z, z)$$
can be carried into one another by a transformation $x = Tz$ ($|T| \neq 0$) if
and only if the pencils of symmetric matrices $A + \lambda B$ and $\tilde A + \lambda \tilde B$ have the
same elementary divisors (finite and infinite) and the same minimal indices.


Note. For pencils of symmetric matrices the rows and columns have the
same minimal indices:
$$p = q, \quad \varepsilon_1 = \eta_1, \ \ldots, \ \varepsilon_p = \eta_p. \tag{48}$$


2. Let us raise the following question: Given two arbitrary complex quadratic
forms
$$A(x, x) = \sum_{i,k=1}^{n} a_{ik} x_i x_k, \quad B(x, x) = \sum_{i,k=1}^{n} b_{ik} x_i x_k.$$
Under what conditions can the two forms be reduced simultaneously to
sums of squares
$$\sum_{i=1}^{n} a_i z_i^2 \quad \text{and} \quad \sum_{i=1}^{n} b_i z_i^2 \tag{49}$$
by a non-singular transformation of the variables $x = Tz$ ($|T| \neq 0$)?




Let us assume that the quadratic forms $A(x, x)$ and $B(x, x)$ have this
property. Then the pencil of matrices $A + \lambda B$ is congruent to the pencil
of diagonal matrices
$$\{a_1 + \lambda b_1, \ a_2 + \lambda b_2, \ \ldots, \ a_n + \lambda b_n\}. \tag{50}$$
Suppose that among the diagonal binomials $a_i + \lambda b_i$ there are precisely $r$
($r \leq n$) that are not identically zero. Without loss of generality we can
assume that
$$a_1 = b_1 = 0, \ \ldots, \ a_{n-r} = b_{n-r} = 0, \quad a_i + \lambda b_i \not\equiv 0 \quad (i = n - r + 1, \ldots, n). \tag{51}$$
Setting
$$A_0 + \lambda B_0 = \{a_{n-r+1} + \lambda b_{n-r+1}, \ \ldots, \ a_n + \lambda b_n\}, \tag{52}$$
we represent the matrix (50) in the form
$$\{O, \ A_0 + \lambda B_0\}, \tag{53}$$
where $O$ is the square zero matrix of order $n - r$.

Comparing (53) with (34) (p. 39), we see that in this case all the minimal
indices are zero. Moreover, all the elementary divisors are linear. Thus
we have obtained the following theorem:


THEOREM 7: Two quadratic forms $A(x, x)$ and $B(x, x)$ can be reduced
simultaneously to sums of squares (49) by a transformation of the variables
if and only if in the pencil of matrices $A + \lambda B$ all the elementary divisors
(finite and infinite) are linear and all the minimal indices are zero.


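In order to reduce two quadratic forms $A(x, x)$ and $B(x, x)$ simultaneously
to some canonical form in the general case, we have to replace the
pencil of matrices $A + \lambda B$ by a strictly equivalent 'canonical' pencil of
symmetric matrices.

Suppose the pencil of symmetric matrices $A + \lambda B$ has the minimal indices
$\varepsilon_1 = \cdots = \varepsilon_g = 0$, $\varepsilon_{g+1} \neq 0, \ldots, \varepsilon_p \neq 0$, the infinite elementary divisors
$\mu^{u_1}, \mu^{u_2}, \ldots, \mu^{u_s}$ and the finite ones $(\lambda + \lambda_1)^{c_1}, (\lambda + \lambda_2)^{c_2}, \ldots, (\lambda + \lambda_t)^{c_t}$. Then,
in the canonical form (30), $g = h$, $p = q$ and $\varepsilon_{g+1} = \eta_{g+1}, \ldots, \varepsilon_p = \eta_p$. We
replace in (30) every two diagonal blocks of the form $L_\varepsilon$ and $L_\varepsilon^T$ by a single
diagonal block $\begin{pmatrix} O & L_\varepsilon^T \\ L_\varepsilon & O \end{pmatrix}$ and each block of the form $N^{(u)} = E^{(u)} + \lambda H^{(u)}$ by the
strictly equivalent symmetric block
$$\tilde N^{(u)} = V^{(u)} N^{(u)} =
\begin{pmatrix}
0 & 0 & \cdots & 0 & 1 \\
0 & 0 & \cdots & 1 & \lambda \\
\vdots & \vdots & & \vdots & \vdots \\
0 & 1 & \cdots & 0 & 0 \\
1 & \lambda & \cdots & 0 & 0
\end{pmatrix},
\qquad
V^{(u)} =
\begin{pmatrix}
0 & 0 & \cdots & 0 & 1 \\
0 & 0 & \cdots & 1 & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
0 & 1 & \cdots & 0 & 0 \\
1 & 0 & \cdots & 0 & 0
\end{pmatrix}. \tag{54}$$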

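Moreover, instead of the regular diagonal block $J + \lambda E$ in (30) ($J$ is a
Jordan matrix)
$$J + \lambda E = \{(\lambda + \lambda_1) E^{(c_1)} + H^{(c_1)}, \ \ldots, \ (\lambda + \lambda_t) E^{(c_t)} + H^{(c_t)}\},$$
we take the strictly equivalent block
$$\{Z^{(c_1)}_{\lambda_1}, \ldots, Z^{(c_t)}_{\lambda_t}\}, \tag{55}$$
where
$$Z^{(c_i)}_{\lambda_i} = V^{(c_i)} \left[(\lambda + \lambda_i) E^{(c_i)} + H^{(c_i)}\right] =
\begin{pmatrix}
0 & \cdots & 0 & \lambda + \lambda_i \\
0 & \cdots & \lambda + \lambda_i & 1 \\
\vdots & & \vdots & \vdots \\
\lambda + \lambda_i & 1 & \cdots & 0
\end{pmatrix}
\qquad (i = 1, 2, \ldots, t). \tag{56}$$
The pencil $A + \lambda B$ is strictly equivalent to the symmetric pencil
$$\tilde A + \lambda \tilde B = \left\{O; \
\begin{pmatrix} O & L^T_{\varepsilon_{g+1}} \\ L_{\varepsilon_{g+1}} & O \end{pmatrix}, \ldots,
\begin{pmatrix} O & L^T_{\varepsilon_p} \\ L_{\varepsilon_p} & O \end{pmatrix}; \
\tilde N^{(u_1)}, \ldots, \tilde N^{(u_s)}; \
Z^{(c_1)}_{\lambda_1}, \ldots, Z^{(c_t)}_{\lambda_t}\right\}, \tag{57}$$
where the first block $O$ is the square zero matrix of order $g$.

Two quadratic forms with complex coefficients $A(x, x)$ and $B(x, x)$ can
be simultaneously reduced to the canonical forms $\tilde A(z, z)$ and $\tilde B(z, z)$ defined
in (57) by a transformation of the variables $x = Tz$ ($|T| \neq 0$).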


17 In the Russian edition the author stated that propositions analogous to Theorems 6
and 7 hold for hermitian forms. A. I. Mal'cev has pointed out to the author that this is
not the case. As regards singular pencils of hermitian forms, see [197 II].




§ 7. Application to Differential Equations 


1. The results obtained will now be applied to a system of $m$ linear differential
equations of the first order in $n$ unknown functions with constant
coefficients:¹⁸
$$\sum_{k=1}^{n} a_{ik} x_k + \sum_{k=1}^{n} b_{ik} \frac{dx_k}{dt} = f_i(t) \qquad (i = 1, 2, \ldots, m), \tag{58}$$
or in matrix notation:
$$Ax + B \frac{dx}{dt} = f(t); \tag{59}$$
here¹⁹
$$A = \|a_{ik}\|, \quad B = \|b_{ik}\| \qquad (i = 1, 2, \ldots, m; \ k = 1, 2, \ldots, n),$$
$$x = (x_1, x_2, \ldots, x_n), \quad f = (f_1, f_2, \ldots, f_m).$$
We introduce new unknown functions $z_1, z_2, \ldots, z_n$ that are connected
with the old $x_1, x_2, \ldots, x_n$ by a linear non-singular transformation with
constant coefficients:
$$x = Qz \qquad (z = (z_1, z_2, \ldots, z_n); \ |Q| \neq 0). \tag{60}$$

Moreover, instead of the equations (58) we can take $m$ arbitrary independent
combinations of these equations, which is equivalent to multiplying
the matrices $A$, $B$, $f$ on the left by a square non-singular matrix $P$ of order $m$.
Substituting $Qz$ for $x$ in (59) and multiplying (59) on the left by $P$, we
obtain:
$$\tilde A z + \tilde B \frac{dz}{dt} = \tilde f(t), \tag{61}$$
where
$$\tilde A = PAQ, \quad \tilde B = PBQ, \quad \tilde f = Pf = (\tilde f_1, \tilde f_2, \ldots, \tilde f_m). \tag{62}$$
The matrix pencils $A + \lambda B$ and $\tilde A + \lambda \tilde B$ are strictly equivalent:
$$\tilde A + \lambda \tilde B = P(A + \lambda B)Q. \tag{63}$$

We choose the matrices $P$ and $Q$ such that the pencil $\tilde A + \lambda \tilde B$ has the
canonical quasi-diagonal form


18 The particular case where $m = n$ and the system (58) is solved with respect to the
derivatives has been treated in detail in Vol. I, Chapter V, § 5.

It is well known that a system of linear differential equations with constant coefficients
of arbitrary order $s$ can be reduced to the form (58) if all the derivatives of the
unknown functions up to and including the order $s - 1$ are included as additional unknown
functions.

19 We recall that parentheses denote column matrices. Thus, $x = (x_1, x_2, \ldots, x_n)$ is the
column with the elements $x_1, x_2, \ldots, x_n$.




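$$\tilde A + \lambda \tilde B = \{O_{h \times g}, \ L_{\varepsilon_{g+1}}, \ldots, L_{\varepsilon_p}, \ L^T_{\eta_{h+1}}, \ldots, L^T_{\eta_q}, \ N^{(u_1)}, \ldots, N^{(u_s)}, \ J + \lambda E\}. \tag{64}$$

In accordance with the diagonal blocks in (64) the system of differential
equations splits into $\nu = p - g + q - h + s + 2$ separate systems of the form
$$O_{h \times g} \, \overset{1}{z} = \overset{1}{\tilde f}, \tag{65}$$
$$L_{\varepsilon_{g+i}}\!\left(\frac{d}{dt}\right) \overset{1+i}{z} = \overset{1+i}{\tilde f} \qquad (i = 1, 2, \ldots, p - g), \tag{66}$$
$$L^T_{\eta_{h+j}}\!\left(\frac{d}{dt}\right) \overset{p-g+1+j}{z} = \overset{p-g+1+j}{\tilde f} \qquad (j = 1, 2, \ldots, q - h), \tag{67}$$
$$N^{(u_k)}\!\left(\frac{d}{dt}\right) \overset{p-g+q-h+1+k}{z} = \overset{p-g+q-h+1+k}{\tilde f} \qquad (k = 1, 2, \ldots, s), \tag{68}$$
$$\left(J + \frac{d}{dt}\right) \overset{\nu}{z} = \overset{\nu}{\tilde f}, \tag{69}$$
where
$$z = \begin{pmatrix} \overset{1}{z} \\ \overset{2}{z} \\ \vdots \\ \overset{\nu}{z} \end{pmatrix}, \qquad
\tilde f = \begin{pmatrix} \overset{1}{\tilde f} \\ \overset{2}{\tilde f} \\ \vdots \\ \overset{\nu}{\tilde f} \end{pmatrix}, \tag{70}$$
$$\overset{1}{z} = (z_1, \ldots, z_g), \ \overset{1}{\tilde f} = (\tilde f_1, \ldots, \tilde f_h), \ \overset{2}{z} = (z_{g+1}, \ldots), \ \overset{2}{\tilde f} = (\tilde f_{h+1}, \ldots), \ \text{etc.}, \tag{71}$$
$$\Lambda\!\left(\frac{d}{dt}\right) = A + B \frac{d}{dt}, \quad \text{if} \quad \Lambda(\lambda) = A + \lambda B. \tag{72}$$

Thus, the integration of the system (59) in the most general case is
reduced to the integration of the special systems (65)-(69) of the same type.
In these systems the matrix pencil $A + \lambda B$ has the form $O$, $L_\varepsilon$, $L^T_\eta$, $N^{(u)}$, and
$J + \lambda E$, respectively.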


1) The system (65) is consistent if and only if
$$\overset{1}{\tilde f} = o,$$
i.e.,
$$\tilde f_1 = 0, \ \ldots, \ \tilde f_h = 0. \tag{73}$$
In that case we can take arbitrary functions of $t$ as the unknown functions
$z_1, z_2, \ldots, z_g$ that form the column $\overset{1}{z}$.
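2) The system (66) is of the form
$$L_\varepsilon\!\left(\frac{d}{dt}\right) z = \tilde f \tag{74}$$
or, more explicitly,²⁰
$$\frac{dz_1}{dt} + z_2 = \tilde f_1(t), \quad \frac{dz_2}{dt} + z_3 = \tilde f_2(t), \ \ldots, \ \frac{dz_\varepsilon}{dt} + z_{\varepsilon+1} = \tilde f_\varepsilon(t). \tag{75}$$

Such a system is always consistent. If we take for $z_{\varepsilon+1}(t)$ an arbitrary
function of $t$, then all the remaining unknown functions $z_\varepsilon, z_{\varepsilon-1}, \ldots, z_1$ can
be determined from (75) by successive quadratures.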
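The successive quadratures can be illustrated on polynomial data. The following Python sketch is an addition for illustration only; the helper names are hypothetical, and it assumes that the right-hand sides and the freely chosen function $z_{\varepsilon+1}$ are polynomials in $t$, stored as coefficient lists. It works backwards through (75), integrating $dz_k/dt = \tilde f_k - z_{k+1}$ for $k = \varepsilon, \varepsilon - 1, \ldots, 1$.

```python
from fractions import Fraction

# A polynomial in t is a coefficient list: [c0, c1, c2] means c0 + c1*t + c2*t^2.

def deriv(p):
    return [k * c for k, c in enumerate(p)][1:] or [0]

def integ(p, const=0):
    # Antiderivative with the given constant of integration.
    return [const] + [Fraction(c, k + 1) for k, c in enumerate(p)]

def add(p, q, sign=1):
    # p + sign*q, padding the shorter list with zeros.
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return [a + sign * b for a, b in zip(p, q)]

def solve_L_system(f, z_last, consts):
    """Solve dz_k/dt + z_{k+1} = f_k (k = 1..eps) for z_eps, ..., z_1,
    given the freely chosen z_{eps+1} = z_last. consts holds one
    integration constant per quadrature, the one for z_eps first."""
    z = [z_last]
    for f_k, c in zip(reversed(f), consts):
        z.insert(0, integ(add(f_k, z[0], -1), c))
    return z          # [z_1, ..., z_eps, z_{eps+1}]

f = [[1], [0, 2]]                                        # f_1 = 1, f_2 = 2t, eps = 2
z = solve_L_system(f, z_last=[0, 0, 3], consts=[0, 0])   # z_3 = 3t^2
# Residuals of (75): dz_k/dt + z_{k+1} - f_k must vanish identically.
residuals = [add(add(deriv(z[k]), z[k + 1]), f[k], -1) for k in range(2)]
```

Each quadrature contributes one arbitrary constant, so in this example the solution depends on one arbitrary function and $\varepsilon = 2$ arbitrary constants.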


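3) The system (67) is of the form
$$L^T_\eta\!\left(\frac{d}{dt}\right) z = \tilde f \tag{76}$$
or, more explicitly,²¹
$$\frac{dz_1}{dt} = \tilde f_1(t), \quad \frac{dz_2}{dt} + z_1 = \tilde f_2(t), \ \ldots, \ \frac{dz_\eta}{dt} + z_{\eta-1} = \tilde f_\eta(t), \quad z_\eta = \tilde f_{\eta+1}(t). \tag{77}$$

From all the equations (77) except the first we determine $z_\eta, z_{\eta-1}, \ldots, z_1$
uniquely:
$$\begin{aligned}
z_\eta &= \tilde f_{\eta+1}, \\
z_{\eta-1} &= \tilde f_\eta - \frac{d\tilde f_{\eta+1}}{dt}, \\
&\ \ \vdots \\
z_1 &= \tilde f_2 - \frac{d\tilde f_3}{dt} + \cdots + (-1)^{\eta-1} \frac{d^{\eta-1}\tilde f_{\eta+1}}{dt^{\eta-1}}.
\end{aligned} \tag{78}$$

Substituting this expression for $z_1$ into the first equation, we obtain the
condition for consistency:
$$\tilde f_1 - \frac{d\tilde f_2}{dt} + \frac{d^2\tilde f_3}{dt^2} - \cdots + (-1)^{\eta} \frac{d^{\eta}\tilde f_{\eta+1}}{dt^{\eta}} = 0. \tag{79}$$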
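The back-substitution (78) and the consistency test (79) can be tried out on polynomial right-hand sides. The Python sketch below is an illustration added here, with hypothetical helper names; polynomials in $t$ are again stored as coefficient lists. It computes $z_\eta, \ldots, z_1$ from the last $\eta$ equations of (77) and returns the defect $\tilde f_1 - dz_1/dt$ of the first equation, which is exactly the left-hand side of (79).

```python
# A polynomial in t is a coefficient list: [c0, c1, c2] means c0 + c1*t + c2*t^2.

def deriv(p):
    return [k * c for k, c in enumerate(p)][1:] or [0]

def sub(p, q):
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def solve_LT_system(f):
    """f = [f_1, ..., f_{eta+1}] with eta >= 1. Returns ([z_1, ..., z_eta], defect),
    where defect = f_1 - dz_1/dt vanishes identically iff (79) holds."""
    eta = len(f) - 1
    z = [None] * eta
    z[eta - 1] = f[eta]                      # z_eta = f_{eta+1}
    for k in range(eta - 1, 0, -1):          # z_k = f_{k+1} - dz_{k+1}/dt
        z[k - 1] = sub(f[k], deriv(z[k]))
    defect = sub(f[0], deriv(z[0]))
    return z, defect

# eta = 2 with f_2 = t, f_3 = t^2: (79) reads f_1 - 1 + 2 = 0, i.e. f_1 = -1.
z, defect = solve_LT_system([[-1], [0, 1], [0, 0, 1]])
consistent = all(c == 0 for c in defect)
_, defect_bad = solve_LT_system([[0], [0, 1], [0, 0, 1]])   # f_1 = 0 violates (79)
```

In contrast with the system (75), no arbitrary constants or functions enter here: the unknowns are determined uniquely, and the only freedom lies in whether the right-hand sides satisfy (79).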


20 We have changed the indices of $z$ and $\tilde f$ to simplify the notation. In order to return
from (75) to (66) we have to replace $\varepsilon$ by $\varepsilon_{g+i}$ and add to each index of $z$ the number
$g + \varepsilon_{g+1} + \cdots + \varepsilon_{g+i-1} + i - 1$, to each index of $\tilde f$ the number $h + \varepsilon_{g+1} + \cdots + \varepsilon_{g+i-1}$.

21 Here, as in the preceding case, we have changed the notation. See the preceding
footnote.




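4) The system (68) is of the form
$$N^{(u)}\!\left(\frac{d}{dt}\right) z = \tilde f \tag{80}$$
or, more explicitly,
$$\frac{dz_2}{dt} + z_1 = \tilde f_1, \quad \frac{dz_3}{dt} + z_2 = \tilde f_2, \ \ldots, \ \frac{dz_u}{dt} + z_{u-1} = \tilde f_{u-1}, \quad z_u = \tilde f_u. \tag{81}$$

Hence we determine successively the unique solutions
$$\begin{aligned}
z_u &= \tilde f_u, \\
z_{u-1} &= \tilde f_{u-1} - \frac{d\tilde f_u}{dt}, \\
&\ \ \vdots \\
z_1 &= \tilde f_1 - \frac{d\tilde f_2}{dt} + \cdots + (-1)^{u-1} \frac{d^{u-1}\tilde f_u}{dt^{u-1}}.
\end{aligned} \tag{82}$$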


5) The system (69) is of the form
$$Jz + \frac{dz}{dt} = \tilde f. \tag{83}$$

As we have proved in Vol. I, Chapter V, § 5, the general solution of such
a system has the form
$$z = e^{-Jt} z_0 + \int_0^t e^{-J(t-\tau)} \tilde f(\tau)\, d\tau; \tag{84}$$
here $z_0$ is a column matrix with arbitrary elements (the initial values of
the unknown functions for $t = 0$).
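Formula (84) can be checked numerically in the simplest case. The Python sketch below is an illustration added here and rests on two assumptions that are not in the text: $J$ is taken to be a $1 \times 1$ block $(a)$ with $a \neq 0$, and the right-hand side is a constant $c$, so that the integral in (84) evaluates in closed form to $c(1 - e^{-at})/a$. The residual of (83) is then estimated by a central difference.

```python
import math

def z_formula(t, a, z0, c):
    """(84) for J = (a), a != 0, and constant right-hand side f(t) = c."""
    return math.exp(-a * t) * z0 + c * (1 - math.exp(-a * t)) / a

def residual(t, a, z0, c, h=1e-6):
    """a*z + dz/dt - c, with dz/dt approximated by a central difference."""
    dz = (z_formula(t + h, a, z0, c) - z_formula(t - h, a, z0, c)) / (2 * h)
    return a * z_formula(t, a, z0, c) + dz - c

# The residual should be zero up to the discretization error of the difference quotient.
r = residual(0.7, 2.0, 1.5, 3.0)
```

At $t = 0$ the formula returns $z_0$, in agreement with the interpretation of $z_0$ as the column of initial values.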

The inverse transition from the system (61) to (59) is effected by the
formulas (60) and (62), according to which each of the functions $x_1, \ldots, x_n$
is a linear combination of the functions $z_1, \ldots, z_n$ and each of the functions
$\tilde f_1(t), \ldots, \tilde f_m(t)$ is expressed linearly (with constant coefficients) in terms
of the functions $f_1(t), \ldots, f_m(t)$.


2. The preceding analysis shows that: In general, for the consistency of the
system (58) certain well-defined linear dependence relations (with constant
coefficients) must hold among the right-hand sides of the equations and the
derivatives of these right-hand sides.

If these relations are satisfied, then the general solution of the system
contains both arbitrary constants and arbitrary functions linearly.

The character of the consistency conditions and the character of the
solutions (in particular, the number of arbitrary constants and arbitrary
functions) are determined by the minimal indices and the elementary divisors
of the pencil $A + \lambda B$, because the canonical form (65)-(69) of the system
of differential equations depends on these minimal indices and elementary
divisors.


CHAPTER XIII 


MATRICES WITH NON-NEGATIVE ELEMENTS 


In this chapter we shall study properties of real matrices with non-negative
elements. Such matrices have important applications in the theory of
probability, where they are used for the investigation of Markov chains
('stochastic matrices,' see [46]), and in the theory of small oscillations of
elastic systems ('oscillation matrices,' see [17]).


§ 1. General Properties 


1. We begin with some definitions.
DEFINITION 1: A rectangular matrix $A$ with real elements
$$A = \|a_{ik}\| \qquad (i = 1, 2, \ldots, m; \ k = 1, 2, \ldots, n)$$
is called non-negative (notation: $A \geq O$) or positive (notation: $A > O$) if
all the elements of $A$ are non-negative ($a_{ik} \geq 0$) or positive ($a_{ik} > 0$).

DEFINITION 2: A square matrix $A = \|a_{ik}\|_1^n$ is called reducible if the
index set $1, 2, \ldots, n$ can be split into two complementary sets (without common
indices) $i_1, i_2, \ldots, i_\mu$; $k_1, k_2, \ldots, k_\nu$ ($\mu + \nu = n$) such that
$$a_{i_\alpha k_\beta} = 0 \qquad (\alpha = 1, 2, \ldots, \mu; \ \beta = 1, 2, \ldots, \nu).$$
Otherwise the matrix is called irreducible.

By a permutation of a square matrix $A = \|a_{ik}\|_1^n$ we mean a permutation
of the rows of $A$ combined with the same permutation of the columns.

The definition of a reducible matrix and an irreducible matrix can also
be formulated as follows:


DEFINITION 2': A matrix $A = \|a_{ik}\|_1^n$ is called reducible if there is a
permutation that puts it into the form
$$\tilde A = \begin{pmatrix} B & O \\ C & D \end{pmatrix},$$
where $B$ and $D$ are square matrices. Otherwise $A$ is called irreducible.




Suppose that $A = \|a_{ik}\|_1^n$ corresponds to a linear operator $A$ in an
$n$-dimensional vector space $R$ with the basis $e_1, e_2, \ldots, e_n$. To a permutation
of $A$ there corresponds a renumbering of the basis vectors, i.e., a transition
from the basis $e_1, e_2, \ldots, e_n$ to a new basis $e'_1 = e_{j_1}, e'_2 = e_{j_2}, \ldots, e'_n = e_{j_n}$,
where $(j_1, j_2, \ldots, j_n)$ is a permutation of the indices $1, 2, \ldots, n$. The matrix
$A$ then goes over into a similar matrix $\tilde A = T^{-1}AT$. (Each row and each
column of the transforming matrix $T$ contains a single element 1, and the
remaining elements are zero.)

2. By a $\nu$-dimensional coordinate subspace of $R$ we mean a subspace of $R$
with a basis $e_{k_1}, e_{k_2}, \ldots, e_{k_\nu}$ ($1 \leq k_1 < k_2 < \cdots < k_\nu \leq n$). There are $\binom{n}{\nu}$
$\nu$-dimensional coordinate subspaces of $R$ connected with a given basis
$e_1, e_2, \ldots, e_n$. The definition of a reducible matrix can also be given in the
following form:


DEFINITION 2'': A matrix $A = \|a_{ik}\|_1^n$ is called reducible if and only if
the corresponding operator $A$ has a $\nu$-dimensional invariant coordinate subspace
with $\nu < n$.


We shall now prove the following lemma: 


LEMMA 1: If $A \geq O$ is an irreducible matrix of order $n$, then
$$(E + A)^{n-1} > O. \tag{1}$$

Proof. For the proof of the lemma it is sufficient to show that for every
vector¹ (i.e., column) $y \geq o$ ($y \neq o$) the inequality
$$(E + A)^{n-1} y > o$$
holds.

This inequality will be established if we can only show that under the
conditions $y \geq o$ and $y \neq o$ the vector $z = (E + A)y$ always has fewer zero
coordinates than $y$ does. Let us assume the contrary. Then $y$ and $z$ have
the same zero coordinates.² Without loss of generality we may assume that
the columns $y$ and $z$ have the form³


1 Here and throughout this chapter we mean by a vector a column of $n$ numbers. In
this way we identify, as it were, a vector with the column of its coordinates in that basis
in which the given matrix $A = \|a_{ik}\|_1^n$ determines a certain linear operator.


2 Here we start from the fact that $z = y + Ay$ and $Ay \geq o$; therefore to positive
coordinates of $y$ there correspond positive coordinates of $z$.


3 The columns $y$ and $z$ can be brought into this form by means of a suitable renumbering
of the coordinates (the same for $y$ and $z$).




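$$y = \begin{pmatrix} u \\ o \end{pmatrix}, \quad z = \begin{pmatrix} v \\ o \end{pmatrix} \qquad (u > o, \ v > o),$$
where the columns $u$ and $v$ are of the same dimension.

Setting
$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix},$$
we have
$$\begin{pmatrix} u \\ o \end{pmatrix} + \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} u \\ o \end{pmatrix} = \begin{pmatrix} v \\ o \end{pmatrix},$$
and hence
$$A_{21} u = o.$$
Since $u > o$, it follows that
$$A_{21} = O.$$

This equation contradicts the irreducibility of $A$.
Thus the lemma is proved.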
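Definition 2 and Lemma 1 can be tried out on small matrices. The Python sketch below is an illustration added here; the function names are hypothetical, and the reducibility test is a brute-force search over all splittings of the index set as in Definition 2, practical only for small $n$. It then forms $(E + A)^{n-1}$ and inspects its elements.

```python
from itertools import combinations

def is_reducible(A):
    """Definition 2 by brute force: look for a split of the indices into
    non-empty sets I and K with A[i][k] == 0 for all i in I, k in K."""
    n = len(A)
    for size in range(1, n):
        for I in combinations(range(n), size):
            K = [k for k in range(n) if k not in I]
            if all(A[i][k] == 0 for i in I for k in K):
                return True
    return False

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][l] * Y[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def E_plus_A_power(A):
    """Return (E + A)^(n-1) for a square matrix A of order n."""
    n = len(A)
    M = [[A[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    P = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(n - 1):
        P = matmul(P, M)
    return P

cyclic = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]   # irreducible, non-negative
lower = [[1, 0], [1, 1]]                     # reducible: element (1, 2) is zero

P = E_plus_A_power(cyclic)
all_positive = all(x > 0 for row in P for x in row)
```

For the cyclic matrix the power $(E + A)^2 = E + 2A + A^2$ has no zero elements, as the lemma asserts, although $A$ itself has six zeros.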
We introduce the powers of $A$:
$$A^q = \|a_{ik}^{(q)}\|_1^n \qquad (q = 1, 2, \ldots).$$

Then the lemma has the following corollary:


COROLLARY: If $A \geq O$ is an irreducible matrix, then for every index
pair $i, k$ ($1 \leq i, k \leq n$) there exists a positive integer $q$ such that
$$a_{ik}^{(q)} > 0. \tag{2}$$

Moreover, $q$ can always be chosen within the bounds
$$\left.\begin{aligned} q &\leq m - 1 \quad \text{if } i \neq k, \\ q &\leq m \quad \text{if } i = k, \end{aligned}\right\} \tag{3}$$
where $m$ is the degree of the minimal polynomial $\psi(\lambda)$ of $A$.

For let $r(\lambda)$ denote the remainder on dividing $(\lambda + 1)^{n-1}$ by $\psi(\lambda)$.
Then by (1) we have $r(A) > O$. Since the degree of $r(\lambda)$ is less than $m$,
it follows from this inequality that for arbitrary $i, k$ ($1 \leq i, k \leq n$) at least
one of the non-negative numbers
$$\delta_{ik}, \ a_{ik}, \ a_{ik}^{(2)}, \ \ldots, \ a_{ik}^{(m-1)}$$
is not zero. Since $\delta_{ik} = 0$ for $i \neq k$, the first of the relations (3) follows.




The other relation (for $i = k$) is obtained similarly if the inequality
$r(A) > O$ is replaced by $A\,r(A) > O$.⁴

Note. This corollary of the lemma shows that in (1) the number $n - 1$
can be replaced by $m - 1$.


§ 2. Spectral Properties of Irreducible Non-negative Matrices 


1. In 1907 Perron found a remarkable property of the spectra (i.e., the
characteristic values and characteristic vectors) of positive matrices.⁵


THEOREM 1 (Perron): A positive matrix $A = \|a_{ik}\|_1^n$ always has a real
and positive characteristic value $r$ which is a simple root of the characteristic
equation and exceeds the moduli of all the other characteristic values. To this
'maximal' characteristic value $r$ there corresponds a characteristic vector
$z = (z_1, z_2, \ldots, z_n)$ of $A$ with positive coordinates $z_i > 0$ ($i = 1, 2, \ldots, n$).⁶

A positive matrix is a special ease of an irreducible non-negative matrix. 
Frobenius’ has generalized Perron’s theorem by investigating the spectral 
properties of irreducible non-negative matrices. 


THEOREM 2 (Frobenius): An irreducible non-negative matrix
$A = \| a_{ik} \|_1^n$ always has a positive characteristic value $r$ that is a simple root of
the characteristic equation. The moduli of all the other characteristic values
do not exceed $r$. To the ‘maximal’ characteristic value $r$ there corresponds
a characteristic vector with positive coordinates.

Moreover, if $A$ has $h$ characteristic values $\lambda_0 = r, \lambda_1, \ldots, \lambda_{h-1}$ of modulus
$r$, then these numbers are all distinct and are roots of the equation

$$\lambda^h - r^h = 0. \tag{4}$$

More generally: The whole spectrum $\lambda_0, \lambda_1, \ldots, \lambda_{n-1}$ of $A$, regarded as a
system of points in the complex $\lambda$-plane, goes over into itself under a rotation


4 The product of an irreducible non-negative matrix and a positive matrix is itself 
positive. 

5 See [316], [317], and [17], p. 100. 

6 Since $r$ is a simple characteristic value, the characteristic vector $z$ belonging to it is
determined to within a scalar factor. By Perron’s theorem all the coordinates of $z$ are
real, different from zero, and of like sign. By multiplying $z$ by $-1$, if necessary, we
can make all its coordinates positive. In the latter case the vector (column)
$z = (z_1, z_2, \ldots, z_n)$ is called positive (as in Definition 1).

7 See [165] and [166].




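of the plane by the angle $2\pi/h$. If $h > 1$, then $A$ can be put by means of a
permutation into the following ‘cyclic’ form:

$$A = \begin{pmatrix} O & A_{12} & O & \cdots & O \\ O & O & A_{23} & \cdots & O \\ \vdots & \vdots & \vdots & & \vdots \\ O & O & O & \cdots & A_{h-1,h} \\ A_{h1} & O & O & \cdots & O \end{pmatrix}, \tag{5}$$

where there are square blocks along the main diagonal.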
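
The rotational symmetry of the spectrum can be checked on a small matrix that is already in cyclic form. The sketch below is only an illustration: the $3 \times 3$ matrix (with $1 \times 1$ blocks, so $h = 3$) and the helper names are assumptions, not taken from the text. Its characteristic polynomial is $\lambda^3 - 24$, so $r = 24^{1/3}$ and the three characteristic values should be $r, r\varepsilon, r\varepsilon^2$ with $\varepsilon = e^{2\pi i/3}$.

```python
import cmath

def det3(M):
    """Determinant of a 3x3 matrix (entries may be complex)."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def char_det(A, lam):
    """Value of the characteristic determinant |lam*E - A| for a 3x3 matrix A."""
    n = len(A)
    return det3([[(lam if i == k else 0) - A[i][k] for k in range(n)]
                 for i in range(n)])

A = [[0, 2, 0],
     [0, 0, 3],
     [4, 0, 0]]              # irreducible, non-negative, in the cyclic form with h = 3

h = 3
r = 24 ** (1 / 3)            # positive root of lam^3 - 24 = 0
eps = cmath.exp(2j * cmath.pi / h)
roots = [r * eps ** k for k in range(h)]

# Each r * eps^k should make the characteristic determinant vanish (up to rounding).
residuals = [abs(char_det(A, lam)) for lam in roots]
print(residuals)
```

All three residuals are zero up to rounding error, so the spectrum is carried into itself by a rotation through $2\pi/3$.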


Since Perron’s theorem follows as a special case from Frobenius’ 
theorem, we shall only prove the latter. To begin with, we shall agree on 
some notation. 

We write 

$$C \leq D \quad \text{or} \quad D \geq C,$$

where $C$ and $D$ are real rectangular matrices of the same dimensions $m \times n$

$$C = \| c_{ik} \|, \quad D = \| d_{ik} \| \qquad (i = 1, 2, \ldots, m;\ k = 1, 2, \ldots, n),$$

if and only if

$$c_{ik} \leq d_{ik} \qquad (i = 1, 2, \ldots, m;\ k = 1, 2, \ldots, n). \tag{6}$$

If the equality sign can be omitted in all the inequalities (6), then we
shall write

$$C < D \quad \text{or} \quad D > C.$$

In particular, $C \geq O$ $(> O)$ means that all the elements of $C$ are non-negative
(positive).

Furthermore, we denote by $C^+$ the matrix mod $C$ which arises from $C$
when all the elements are replaced by their moduli.


2. Proof of Frobenius’ Theorem:⁹ Let $x = (x_1, x_2, \ldots, x_n)$ $(x \geq o,\ x \neq o)$ be a
fixed real vector. We set:

$$r_x = \min_{1 \leq i \leq n} \frac{(Ax)_i}{x_i} \qquad \Bigl( (Ax)_i = \sum_{k=1}^{n} a_{ik} x_k;\ i = 1, 2, \ldots, n \Bigr).$$

In the definition of the minimum we exclude here the values of $i$ for which
$x_i = 0$. Obviously $r_x \geq 0$, and $r_x$ is the largest real number $\varrho$ for which

$$\varrho x \leq Ax.$$


8 For a direct proof of Perron’s theorem see [17], p. 100 ff.
9 This proof is due to Wielandt [384].




We shall show that the function $r_x$ assumes a maximum value $r$ for some
vector $z \geq o$:

$$r = r_z = \max_{(x \geq o)} r_x = \max_{(x \geq o)} \min_{1 \leq i \leq n} \frac{(Ax)_i}{x_i}. \tag{7}$$


From the definition of $r_x$ it follows that on multiplication of a vector
$x \geq o$ $(x \neq o)$ by a number $\lambda > 0$ the value of $r_x$ does not change. Therefore,
in the computation of the maximum of $r_x$ we can restrict ourselves to
the closed set $M$ of vectors $x$ for which

$$x \geq o \quad \text{and} \quad (x, x) = \sum_{i=1}^{n} x_i^2 = 1.$$

If the function $r_x$ were continuous on $M$, then the existence of a maximum
would be guaranteed. However, though continuous at every ‘point’ $x > o$,
$r_x$ may have discontinuities at the boundary points of $M$ at which one of the
coordinates vanishes. Therefore, we introduce in place of $M$ the set $N$ of
all the vectors $y$ of the form

$$y = (E + A)^{n-1} x \qquad (x \in M).$$


The set $N$, like $M$, is bounded and closed and by Lemma 1 consists of
positive vectors only.
Moreover, when we multiply both sides of the inequality

$$r_x x \leq Ax$$

by $(E + A)^{n-1} > O$, we obtain:

$$r_x y \leq Ay \qquad (y = (E + A)^{n-1} x).$$

Hence, from the definition of $r_y$ we have

$$r_x \leq r_y.$$

Therefore in the computation of the maximum of $r_x$ we can replace $M$
by the set $N$ which consists of positive vectors only. On the bounded and
closed set $N$ the function $r_x$ is continuous and therefore assumes a largest
value for some vector $z > o$.

Every vector $z \geq o$ for which

$$r_z = r \tag{8}$$

will be called extremal.
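
The function $r_x$ and the passage from $x$ to $y = (E + A)^{n-1} x$ can be followed on a $2 \times 2$ example. This is a minimal sketch with exact rational arithmetic; the matrix, the starting vector and the helper names are assumptions chosen for illustration. The starting vector is a boundary point of $M$ up to normalization (one coordinate vanishes), and for this particular matrix the vector $y$ already turns out to be extremal.

```python
from fractions import Fraction

def apply(A, x):
    """Matrix-vector product Ax."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def r_lower(A, x):
    """r_x = min of (Ax)_i / x_i over the indices i with x_i != 0."""
    Ax = apply(A, x)
    return min(Fraction(Ax[i]) / x[i] for i in range(len(x)) if x[i] != 0)

def smooth(A, x):
    """y = (E + A)^(n-1) x, computed by n - 1 applications of E + A."""
    y = list(x)
    for _ in range(len(A) - 1):
        y = [yi + ai for yi, ai in zip(y, apply(A, y))]
    return y

A = [[0, 1],
     [2, 1]]          # irreducible; characteristic values 2 and -1, so r = 2
x = [1, 0]            # non-negative vector with a vanishing coordinate
y = smooth(A, x)      # positive vector

print(r_lower(A, x), y, r_lower(A, y))
```

Here $r_x = 0$, $y = (1, 2)$ and $r_y = 2 = r$, which illustrates the inequality $r_x \leq r_y \leq r$.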




We shall now show that: 1) The number $r$ defined by (7) is positive
and is a characteristic value of $A$; 2) Every extremal vector $z$ is positive
and is a characteristic vector of $A$ for the characteristic value $r$, i.e.,

$$r > 0, \quad z > o, \quad Az = rz. \tag{9}$$

For if $u = (1, 1, \ldots, 1)$, then $r_u = \min_{1 \leq i \leq n} \sum_{k=1}^{n} a_{ik}$. But then $r_u > 0$,
because no row of an irreducible matrix can consist of zeros only. Therefore
$r > 0$, since $r \geq r_u$. Now let

$$x = (E + A)^{n-1} z. \tag{10}$$

Then, by Lemma 1, $x > o$. Suppose that $Az - rz \neq o$. Then by (1), (8),
and (10) we obtain successively:

$$Az - rz \geq o, \quad (E + A)^{n-1} (Az - rz) > o, \quad Ax - rx > o.$$

The last inequality contradicts the definition of $r$, because it would imply
that $Ax - (r + \varepsilon)x > o$ for sufficiently small $\varepsilon > 0$, i.e., $r_x \geq r + \varepsilon > r$.
Therefore $Az = rz$. But then

$$o < x = (E + A)^{n-1} z = (1 + r)^{n-1} z,$$

so that $z > o$.
We shall now show that the moduli of all the characteristic values do not
exceed $r$. Let

$$Ay = \alpha y \qquad (y \neq o). \tag{11}$$

Taking the moduli of both sides in (11), we obtain:¹⁰

$$|\alpha|\, y^+ \leq A y^+. \tag{12}$$

Hence

$$|\alpha| \leq r_{y^+} \leq r.$$

Let $y$ be some characteristic vector corresponding to $r$:

$$Ay = ry \qquad (y \neq o).$$

Then setting $\alpha = r$ in (11) and (12) we conclude that $y^+$ is an extremal
vector, so that $y^+ > o$, i.e., $y = (y_1, y_2, \ldots, y_n)$, where $y_i \neq 0$ $(i = 1, 2, \ldots, n)$.
Hence it follows that only one characteristic direction corresponds to the
characteristic value $r$; for if there were two linearly independent characteristic
vectors $z$ and $z_1$, we could choose numbers $c$ and $d$ such that the characteristic
vector $y = cz + dz_1$ has a zero coordinate, and by what we have
shown this is impossible.


10 Regarding the notation $y^+$, see p. 54.




We now consider the adjoint matrix of the characteristic matrix $\lambda E - A$:

$$B(\lambda) = \| B_{ik}(\lambda) \|_1^n = \Delta(\lambda)\,(\lambda E - A)^{-1},$$

where $\Delta(\lambda)$ is the characteristic polynomial of $A$ and $B_{ik}(\lambda)$ the algebraic
complement of the element $\lambda \delta_{ki} - a_{ki}$ in the determinant $\Delta(\lambda)$. From the
fact that only one characteristic vector $z = (z_1, z_2, \ldots, z_n)$ with $z_1 > 0$,
$z_2 > 0$, ..., $z_n > 0$ corresponds to the characteristic value $r$ (apart from a
factor) it follows that $B(r) \neq O$ and that in every non-zero column of $B(r)$
all the elements are different from zero and are of the same sign. The same
is true for the rows of $B(r)$, since in the preceding argument $A$ can be replaced
by the transposed matrix $A^T$. From these properties of the rows
and columns of $B(r)$ it follows that all the $B_{ik}(r)$ $(i, k = 1, 2, \ldots, n)$ are
different from zero and are of the same sign $\sigma$. Therefore

$$\sigma \Delta'(r) = \sigma \sum_{i=1}^{n} B_{ii}(r) > 0,$$

i.e., $\Delta'(r) \neq 0$ and $r$ is a simple root of the characteristic equation $\Delta(\lambda) = 0$.
Since $r$ is the maximal root of $\Delta(\lambda) = \lambda^n + \ldots$, $\Delta(\lambda)$ increases for
$\lambda \geq r$. Therefore $\Delta'(r) > 0$ and $\sigma = 1$, i.e.,

$$B_{ik}(r) > 0 \qquad (i, k = 1, 2, \ldots, n). \tag{13}$$


3. Proceeding now to the proof of the second part of Frobenius’ theorem,
we shall make use of the following interesting lemma:¹¹

Lemma 2: If $A = \| a_{ik} \|_1^n$ and $C = \| c_{ik} \|_1^n$ are two square matrices of
the same order $n$, where $A$ is irreducible and¹²

$$C^+ \leq A, \tag{14}$$

then for every characteristic value $\gamma$ of $C$ and the maximal characteristic
value $r$ of $A$ we have the inequality

$$|\gamma| \leq r. \tag{15}$$

In the relation (15) the equality sign holds if and only if

$$C = e^{i\varphi} D A D^{-1}, \tag{16}$$

where $e^{i\varphi} = \gamma / r$ and $D$ is a diagonal matrix whose diagonal elements are of
unit modulus ($D^+ = E$).


11 See [384]. 
12 $C$ is a complex matrix and $A \geq O$.




Proof. We denote by $y$ a characteristic vector of $C$ corresponding to the
characteristic value $\gamma$:

$$Cy = \gamma y \qquad (y \neq o). \tag{17}$$

From (14) and (17) we find

$$|\gamma|\, y^+ \leq C^+ y^+ \leq A y^+. \tag{18}$$

Therefore

$$|\gamma| \leq r_{y^+} \leq r.$$

Let us now examine the case $|\gamma| = r$ in detail. Here it follows from
(18) that $y^+$ is an extremal vector for $A$, so that $y^+ > o$ and that $y^+$ is a
characteristic vector of $A$ for the characteristic value $r$. Therefore the
relation (18) assumes the form

$$A y^+ = C^+ y^+ = r y^+, \qquad y^+ > o. \tag{19}$$

Hence by (14)

$$C^+ = A. \tag{20}$$

Let $y = (y_1, y_2, \ldots, y_n)$, where

$$y_j = |y_j|\, e^{i\varphi_j} \qquad (j = 1, 2, \ldots, n).$$

We define a diagonal matrix $D$ by the equation

$$D = \{ e^{i\varphi_1}, e^{i\varphi_2}, \ldots, e^{i\varphi_n} \}.$$

Then

$$y = D y^+.$$

Substituting this expression for $y$ in (17) and then setting $\gamma = r e^{i\varphi}$, we
find easily:

$$F y^+ = r y^+, \tag{21}$$

where

$$F = e^{-i\varphi} D^{-1} C D. \tag{22}$$

Comparing (19) with (21), we obtain

$$F y^+ = C^+ y^+ = A y^+. \tag{23}$$

But by (22) and (20)

$$F^+ = C^+ = A.$$




Therefore we find from (23)

$$F y^+ = F^+ y^+.$$

Since $y^+ > o$, this equation can hold only if

$$F = F^+,$$

i.e.,

$$e^{-i\varphi} D^{-1} C D = A.$$

Hence

$$C = e^{i\varphi} D A D^{-1},$$

and the Lemma is proved.


4. We return to Frobenius’ theorem and apply the lemma to an irreducible
matrix $A \geq O$ that has precisely $h$ characteristic values of maximal modulus $r$:

$$\lambda_0 = r e^{i\varphi_0}, \quad \lambda_1 = r e^{i\varphi_1}, \quad \ldots, \quad \lambda_{h-1} = r e^{i\varphi_{h-1}}$$
$$(0 = \varphi_0 < \varphi_1 < \varphi_2 < \cdots < \varphi_{h-1} < 2\pi).$$

Then, setting $C = A$ and $\gamma = \lambda_k$ in the lemma, we have, for every $k = 0, 1, \ldots, h-1$,

$$A = e^{i\varphi_k} D_k A D_k^{-1}, \tag{24}$$

where $D_k$ is a diagonal matrix with $D_k^+ = E$.
Again, let $z$ be a positive characteristic vector of $A$ corresponding to the
maximal characteristic value $r$:

$$Az = rz \qquad (z > o). \tag{25}$$

Then setting

$$y^{(k)} = D_k z \qquad (y^{(k)+} = z > o), \tag{26}$$

we find from (25) and (26):

$$A y^{(k)} = \lambda_k y^{(k)} \qquad (\lambda_k = r e^{i\varphi_k};\ k = 0, 1, \ldots, h-1). \tag{27}$$

The last equation shows that the vectors $y^{(0)}, y^{(1)}, \ldots, y^{(h-1)}$ defined in (26) are
characteristic vectors of $A$ for the characteristic values $\lambda_0, \lambda_1, \ldots, \lambda_{h-1}$.
From (24) it follows not only that $\lambda_0 = r$, but also that each characteristic
value $\lambda_1, \ldots, \lambda_{h-1}$ of $A$ is simple. Therefore the characteristic vectors
$y^{(k)}$ and hence the matrices $D_k$ $(k = 0, 1, \ldots, h-1)$ are determined to within
scalar factors. To define the matrices $D_0, D_1, \ldots, D_{h-1}$ uniquely we shall
choose their first diagonal element to be 1. Then $D_0 = E$ and $y^{(0)} = z > o$.




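Furthermore, from (24) it follows that

$$A = e^{i(\varphi_j \pm \varphi_k)} D_j D_k^{\pm 1} A D_k^{\mp 1} D_j^{-1} \qquad (j, k = 0, 1, \ldots, h-1).$$

Hence we deduce similarly that the vector

$$D_j D_k^{\pm 1} z$$

is a characteristic vector of $A$ corresponding to the characteristic value

$$r e^{i(\varphi_j \pm \varphi_k)}.$$

Therefore $e^{i(\varphi_j \pm \varphi_k)}$ coincides with one of the numbers $e^{i\varphi_l}$ and the matrix
$D_j D_k^{\pm 1}$ with the corresponding matrix $D_l$; that is, we have, for some
$l_1, l_2$ $(0 \leq l_1, l_2 \leq h-1)$

$$e^{i(\varphi_j + \varphi_k)} = e^{i\varphi_{l_1}}, \qquad e^{i(\varphi_j - \varphi_k)} = e^{i\varphi_{l_2}},$$
$$D_j D_k = D_{l_1}, \qquad D_j D_k^{-1} = D_{l_2}.$$

Thus: The numbers $e^{i\varphi_0}, e^{i\varphi_1}, \ldots, e^{i\varphi_{h-1}}$ and the corresponding diagonal
matrices $D_0, D_1, \ldots, D_{h-1}$ form two isomorphic multiplicative abelian
groups.

In every finite group consisting of $h$ distinct elements the $h$-th power
of every element is equal to the unit element of the group. Therefore
$e^{i\varphi_0}, e^{i\varphi_1}, \ldots, e^{i\varphi_{h-1}}$ are $h$-th roots of unity. Since there are $h$ such roots of
unity and $\varphi_0 = 0 < \varphi_1 < \varphi_2 < \cdots < \varphi_{h-1} < 2\pi$,

$$\varphi_k = \frac{2k\pi}{h} \qquad (k = 0, 1, 2, \ldots, h-1)$$

and

$$e^{i\varphi_k} = \varepsilon^k \qquad (\varepsilon = e^{i\varphi_1} = e^{2\pi i/h};\ k = 0, 1, \ldots, h-1), \tag{28}$$

$$\lambda_k = r \varepsilon^k \qquad (k = 0, 1, \ldots, h-1). \tag{29}$$

The numbers $\lambda_0, \lambda_1, \ldots, \lambda_{h-1}$ form a complete system of roots of (4).
In accordance with (28), we have:¹⁴

$$D_k = D^k \qquad (D = D_1;\ k = 0, 1, \ldots, h-1). \tag{30}$$

The equation (24) now gives us (for $k = 1$):

$$A = e^{2\pi i/h} D A D^{-1}. \tag{31}$$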


14 Here we use the isomorphism of the multiplicative groups $e^{i\varphi_0}, e^{i\varphi_1}, \ldots, e^{i\varphi_{h-1}}$ and
$D_0, D_1, \ldots, D_{h-1}$.


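Hence it follows that the matrix $A$ on multiplication by $e^{2\pi i/h}$ goes over into a
similar matrix and, therefore, that the whole system of $n$ characteristic
values of $A$ on multiplication by $e^{2\pi i/h}$ goes over into itself.¹⁵
Further,

$$D^h = E,$$

so that all the diagonal elements of $D$ are $h$-th roots of unity. By a permutation
of $A$ (and similarly of $D$) we can arrange that $D$ be of the following
quasi-diagonal form:

$$D = \{ \eta_0 E_0, \eta_1 E_1, \ldots, \eta_{s-1} E_{s-1} \}, \tag{32}$$

where $E_0, E_1, \ldots, E_{s-1}$ are unit matrices and

$$\eta_p = e^{i\psi_p}, \qquad \psi_p = n_p \frac{2\pi}{h}$$
$$(n_p \text{ is an integer};\ p = 0, 1, \ldots, s-1;\ 0 = n_0 < n_1 < \cdots < n_{s-1} < h).$$

Obviously $s \leq h$.
Writing $A$ in block form (in accordance with (32))

$$A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1s} \\ A_{21} & A_{22} & \cdots & A_{2s} \\ \vdots & \vdots & & \vdots \\ A_{s1} & A_{s2} & \cdots & A_{ss} \end{pmatrix}, \tag{33}$$

we replace (31) by the system of equations

$$\varepsilon A_{pq} = \frac{\eta_{q-1}}{\eta_{p-1}} A_{pq} \qquad (p, q = 1, 2, \ldots, s;\ \varepsilon = e^{2\pi i/h}). \tag{34}$$

Hence for every $p$ and $q$ either $\dfrac{\eta_{q-1}}{\eta_{p-1}} = \varepsilon$ or $A_{pq} = O$.

Let us take $p = 1$. Since the matrices $A_{12}, A_{13}, \ldots, A_{1s}$ cannot vanish
simultaneously, one of the numbers $\dfrac{\eta_1}{\eta_0}, \dfrac{\eta_2}{\eta_0}, \ldots, \dfrac{\eta_{s-1}}{\eta_0}$ $(\eta_0 = 1)$ must be equal
to $\varepsilon$. This is only possible for $n_1 = 1$. Then $\dfrac{\eta_1}{\eta_0} = \varepsilon$ and $A_{11} = A_{13} = \ldots = A_{1s} = O$.
Setting $p = 2$ in (34), we find similarly that $n_2 = 2$ and that
$A_{21} = A_{22} = A_{24} = \ldots = A_{2s} = O$, etc. Finally, we obtain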


15 The number $h$ is the largest integer having these properties, because $A$ has precisely
$h$ characteristic values of maximal modulus $r$. Moreover, it follows from (31) that all
the characteristic values of the matrix fall into systems (with $h$ numbers in each) of the
form $\mu_0, \mu_0 \varepsilon, \ldots, \mu_0 \varepsilon^{h-1}$ and that within each such system to any two characteristic
values there correspond elementary divisors of equal degree. One such system is formed
by the roots of the equation (4) $\lambda_0, \lambda_1, \ldots, \lambda_{h-1}$.




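$$A = \begin{pmatrix} O & A_{12} & O & \cdots & O \\ O & O & A_{23} & \cdots & O \\ \vdots & \vdots & \vdots & & \vdots \\ O & O & O & \cdots & A_{s-1,s} \\ A_{s1} & A_{s2} & A_{s3} & \cdots & A_{ss} \end{pmatrix}.$$

Here $n_1 = 1$, $n_2 = 2$, ..., $n_{s-1} = s - 1$. But then for $p = s$ on the right-hand
sides of (34) we have the factors

$$\frac{\eta_{q-1}}{\eta_{s-1}} = \varepsilon^{q-s} \qquad (q = 1, 2, \ldots, s).$$

One of these numbers must be equal to $\varepsilon = e^{2\pi i/h}$. This is only possible when
$s = h$ and $q = 1$; consequently, $A_{s2} = \ldots = A_{ss} = O$.

Thus,

$$D = \{ E_0, \varepsilon E_1, \varepsilon^2 E_2, \ldots, \varepsilon^{h-1} E_{h-1} \},$$

and the matrix $A$ has the form (5).
Frobenius’ theorem is now completely proved.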


5. We now make a few general comments on Frobenius’ theorem. 


Remark 1. In the proof of Frobenius’ theorem we have established
incidentally that for an irreducible matrix $A \geq O$ with the maximal characteristic
value $r$ the adjoint matrix $B(\lambda)$ is positive for $\lambda = r$:

$$B(r) > O, \tag{35}$$

i.e.,

$$B_{ik}(r) > 0 \qquad (i, k = 1, 2, \ldots, n), \tag{35'}$$

where $B_{ik}(r)$ is the algebraic complement of the element $r\delta_{ki} - a_{ki}$ in the
determinant $| rE - A |$.
Let us now consider the reduced adjoint matrix (see Vol. I, Chapter IV, § 6)

$$C(\lambda) = \frac{B(\lambda)}{D_{n-1}(\lambda)},$$

where $D_{n-1}(\lambda)$ is the greatest common divisor of all the polynomials $B_{ik}(\lambda)$
$(i, k = 1, 2, \ldots, n)$. It follows from (35’) that $D_{n-1}(r) \neq 0$. All the roots
of $D_{n-1}(\lambda)$ are characteristic values¹⁶ distinct from $r$. Therefore all the

16 $D_{n-1}(\lambda)$ is a divisor of the characteristic polynomial $D_n(\lambda) \equiv | \lambda E - A |$.




roots of $D_{n-1}(\lambda)$ either are complex or are real and less than $r$. Hence
$D_{n-1}(r) > 0$ and this, in conjunction with (35), yields:¹⁷

$$C(r) = \frac{B(r)}{D_{n-1}(r)} > O. \tag{36}$$

Remark 2. The inequality (35’) enables us to determine bounds for the
maximal characteristic value $r$.
We introduce the notation

$$s_i = \sum_{k=1}^{n} a_{ik} \quad (i = 1, 2, \ldots, n), \qquad s = \min_{1 \leq i \leq n} s_i, \qquad S = \max_{1 \leq i \leq n} s_i.$$

Then: For an irreducible matrix $A \geq O$

$$s \leq r \leq S, \tag{37}$$

and the equality sign on the left or the right of $r$ holds for $s = S$ only; i.e.,
holds only when all the ‘row-sums’ $s_1, s_2, \ldots, s_n$ are equal.¹⁸

For if we add to the last column of the characteristic determinant

$$\Delta(r) = \begin{vmatrix} r - a_{11} & -a_{12} & \cdots & -a_{1n} \\ -a_{21} & r - a_{22} & \cdots & -a_{2n} \\ \vdots & \vdots & & \vdots \\ -a_{n1} & -a_{n2} & \cdots & r - a_{nn} \end{vmatrix}$$

all the preceding columns and then expand the determinant with respect to
the elements of the last column, we obtain:

$$\sum_{k=1}^{n} (r - s_k)\, B_{nk}(r) = 0.$$

Hence (37) follows by (35’).
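
The bounds of Remark 2 and the identity used in their proof can be verified on a small case. The sketch below is only illustrative: the $2 \times 2$ matrix (whose characteristic values are $2$ and $-1$, so that $r = 2$) and the helper names are assumptions, and the determinants are computed by Laplace expansion, which is adequate only for small orders.

```python
def minor(M, i, j):
    """Matrix M with row i and column j deleted."""
    return [row[:j] + row[j + 1:] for t, row in enumerate(M) if t != i]

def det(M):
    """Determinant by Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))

def cofactor(M, i, j):
    """Algebraic complement of the element in row i, column j."""
    return (-1) ** (i + j) * det(minor(M, i, j))

A = [[0, 1],
     [2, 1]]                       # irreducible, r = 2
r = 2
n = len(A)

row_sums = [sum(row) for row in A]                 # s_1, ..., s_n
char = [[(r if i == k else 0) - A[i][k] for k in range(n)] for i in range(n)]

# B_nk(r): algebraic complement of the element in row k of the last column.
B_last = [cofactor(char, k, n - 1) for k in range(n)]
identity = sum((r - row_sums[k]) * B_last[k] for k in range(n))

print(row_sums, B_last, identity)
```

The row sums are $1$ and $3$, the complements $B_{nk}(r)$ are positive, and the weighted sum vanishes, so $r = 2$ must lie strictly between $s = 1$ and $S = 3$.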


Remark 3. An irreducible matrix $A \geq O$ cannot have two linearly independent
non-negative characteristic vectors. For suppose that, apart from
the positive characteristic vector $z > o$ corresponding to the maximal characteristic
value $r$, the matrix $A$ has another characteristic vector $y \geq o$ (linearly
independent of $z$) for the characteristic value $\alpha$:


17 In the following section it will be shown that for an irreducible matrix $A \geq O$ we have
$B(\lambda) > O$, $C(\lambda) > O$ for every real $\lambda \geq r$.


18 Narrower bounds for $r$ than $(s, S)$ are established in the papers [256], [295] and
[119, IV].




$$Ay = \alpha y \qquad (y \neq o;\ y \geq o).$$

Since $r$ is a simple root of the characteristic equation $| \lambda E - A | = 0$,

$$\alpha \neq r.$$

We denote by $u$ the positive characteristic vector of the transposed matrix
$A^T$ for $\lambda = r$:

$$A^T u = ru \qquad (u > o).$$

Then¹⁹

$$r(y, u) = (y, A^T u) = (Ay, u) = \alpha (y, u);$$

hence, as $\alpha \neq r$,

$$(y, u) = 0,$$

and this is impossible for $u > o$, $y \geq o$, $y \neq o$.


Remark 4. In the proof of Frobenius’ Theorem we have established the
following characterization of the maximal characteristic value $r$ of an irreducible
matrix $A \geq O$:

$$r = \max_{(x \geq o)} r_x,$$

where $r_x$ is the largest number $\varrho$ for which $\varrho x \leq Ax$. In other words, since
$r_x = \min_{1 \leq i \leq n} \dfrac{(Ax)_i}{x_i}$, we have

$$r = \max_{(x \geq o)} \min_{1 \leq i \leq n} \frac{(Ax)_i}{x_i}.$$

Similarly, we can define for every vector $x \geq o$ $(x \neq o)$ a number $r^x$ as the
least number $\sigma$ for which

$$\sigma x \geq Ax;$$

i.e., we set

$$r^x = \max_{1 \leq i \leq n} \frac{(Ax)_i}{x_i}.$$

If for some $i$ we have here $x_i = 0$, $(Ax)_i \neq 0$, then we shall take $r^x = +\infty$.
As in the case of the function $r_x$, it turns out here that the function $r^x$
assumes a least value $\hat{r}$ for some vector $v > o$.


Let us show that the number $\hat{r}$ defined by

$$\hat{r} = \min_{(x \geq o)} r^x = \min_{(x \geq o)} \max_{1 \leq i \leq n} \frac{(Ax)_i}{x_i} \tag{38}$$

19 If $y = (y_1, y_2, \ldots, y_n)$ and $u = (u_1, u_2, \ldots, u_n)$, then we mean by $(y, u)$ the ‘scalar
product’ $y^T u = \sum_{i=1}^{n} y_i u_i$. Then $(y, A^T u) = y^T A^T u$ and $(Ay, u) = (Ay)^T u = y^T A^T u$.




coincides with $r$ and that the vector $v \geq o$ $(v \neq o)$ for which this minimum
is assumed is a characteristic vector of $A$ for $\lambda = r$.
For,

$$\hat{r} v - Av \geq o \qquad (v \geq o,\ v \neq o).$$

Suppose now that the sign $\geq$ cannot be replaced by the equality sign. Then
by Lemma 1

$$(E + A)^{n-1} (\hat{r} v - Av) > o, \qquad (E + A)^{n-1} v > o. \tag{39}$$

Setting

$$u = (E + A)^{n-1} v > o,$$

we have

$$\hat{r} u > Au$$

and so for sufficiently small $\varepsilon > 0$

$$(\hat{r} - \varepsilon)\, u > Au \qquad (u > o),$$

which contradicts the definition of $\hat{r}$. Thus

$$Av = \hat{r} v.$$

But then

$$u = (E + A)^{n-1} v = (1 + \hat{r})^{n-1} v.$$

Therefore $u > o$ implies that $v > o$.
Hence, by the Remark 3,

$$\hat{r} = r.$$


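Thus we have for $r$ the double characterization:

$$r = \max_{(x \geq o)} \min_{1 \leq i \leq n} \frac{(Ax)_i}{x_i} = \min_{(x \geq o)} \max_{1 \leq i \leq n} \frac{(Ax)_i}{x_i}. \tag{40}$$

Moreover we have shown that $\max_{(x \geq o)}$ or $\min_{(x \geq o)}$ is only assumed for a positive
characteristic vector for $\lambda = r$.
From this characterization of $r$ we obtain the inequality²⁰

$$\min_{1 \leq i \leq n} \frac{(Ax)_i}{x_i} \leq r \leq \max_{1 \leq i \leq n} \frac{(Ax)_i}{x_i} \qquad (x \geq o,\ x \neq o). \tag{41}$$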
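
The two-sided estimate (41) gives a simple way of bracketing $r$ numerically. The following sketch is an illustration under stated assumptions, not a method taken from the text: the matrix, the starting vector and the helper names are hypothetical, and the vector is repeatedly replaced by $Ax$. Since $\varrho x \leq Ax$ implies $\varrho Ax \leq A(Ax)$ for $A \geq O$ (and similarly for the upper bound), this replacement never widens the bracket.

```python
from fractions import Fraction

def apply(A, x):
    """Matrix-vector product Ax."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def ratio_bounds(A, x):
    """(min_i (Ax)_i/x_i, max_i (Ax)_i/x_i) for a positive integer vector x."""
    ratios = [Fraction(y, xi) for y, xi in zip(apply(A, x), x)]
    return min(ratios), max(ratios)

A = [[0, 1],
     [2, 1]]          # irreducible; characteristic values 2 and -1, so r = 2
x = [1, 1]

history = []
for _ in range(20):
    history.append(ratio_bounds(A, x))
    x = apply(A, x)   # next trial vector; stays positive for this matrix

lo, hi = history[-1]
print(history[0], float(lo), float(hi))
```

The first bracket is $(1, 3)$, the row-sum bounds of Remark 2; after twenty steps both ends agree with $r = 2$ to several decimal places.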


Remark 5. Since in (40) $\max_{(x \geq o)}$ and $\min_{(x \geq o)}$ are only assumed for a positive
characteristic vector of the irreducible matrix $A \geq O$, the inequalities


20 See [128] and also [17], p. 325 ff. 




$$rz \leq Az, \qquad z \geq o,\ z \neq o$$

or

$$rz \geq Az, \qquad z \geq o,\ z \neq o$$

always imply that

$$Az = rz, \qquad z > o.$$


§ 3. Reducible Matrices 


1. The spectral properties of irreducible non-negative matrices that were
established in the preceding section are not preserved when we go over to
reducible matrices. However, since every non-negative matrix $A \geq O$ can
be represented as the limit of a sequence of irreducible positive matrices $A_m$,

$$A = \lim_{m \to \infty} A_m \qquad (A_m > O;\ m = 1, 2, \ldots), \tag{42}$$

some of the spectral properties of irreducible matrices hold in a weaker form
for reducible matrices.

For an arbitrary non-negative matrix $A = \| a_{ik} \|_1^n$ we can prove the
following theorem:

THEOREM 3: A non-negative matrix $A = \| a_{ik} \|_1^n$ always has a non-negative
characteristic value $r$ such that the moduli of all the characteristic
values of $A$ do not exceed $r$. To this ‘maximal’ characteristic value $r$ there
corresponds a non-negative characteristic vector

$$Ay = ry \qquad (y \geq o,\ y \neq o).$$

The adjoint matrix $B(\lambda) = \| B_{ik}(\lambda) \|_1^n = (\lambda E - A)^{-1} \Delta(\lambda)$ satisfies the
inequalities

$$B(\lambda) \geq O, \qquad \frac{d}{d\lambda} B(\lambda) \geq O \qquad \text{for } \lambda \geq r. \tag{43}$$

Proof. Let $A$ be represented as in (42). We denote by $r^{(m)}$ and $y^{(m)}$
the maximal characteristic value of the positive matrix $A_m$ and the corresponding
normalized²¹ positive characteristic vector:

$$A_m y^{(m)} = r^{(m)} y^{(m)} \qquad ((y^{(m)}, y^{(m)}) = 1,\ y^{(m)} > o;\ m = 1, 2, \ldots). \tag{44}$$

Then it follows from (42) that the limit

$$\lim r^{(m)} = r$$


21 By a normalized vector we mean a column $y = (y_1, y_2, \ldots, y_n)$ for which
$(y, y) = \sum_{i=1}^{n} y_i^2 = 1$.




exists, where $r$ is a characteristic value of $A$. From the fact that $r^{(m)} > 0$
and $r^{(m)} > | \lambda_0^{(m)} |$, where $\lambda_0^{(m)}$ is an arbitrary characteristic value of $A_m$
$(m = 1, 2, \ldots)$, we obtain by proceeding to the limit:

$$r \geq 0, \qquad r \geq | \lambda_0 |,$$

where $\lambda_0$ is an arbitrary characteristic value of $A$. This passage to the limit
gives us in place of (35)

$$B(r) \geq O. \tag{45}$$

Furthermore, from the sequence of normalized characteristic vectors $y^{(m)}$
$(m = 1, 2, \ldots)$ we can select a subsequence $y^{(m_p)}$ $(p = 1, 2, \ldots)$ that converges
to some normalized (and therefore non-zero) vector $y$. When we pass to
the limit on both sides of (44) along this subsequence, we obtain:

$$Ay = ry \qquad (y \geq o,\ y \neq o).$$


The inequalities (43) will be established by induction on the order $n$.
For $n = 1$, they are obvious.²² Let us establish them for a matrix $A = \| a_{ik} \|_1^n$
of order $n$ on the assumption that they are true for matrices of order less
than $n$.

Expanding the characteristic determinant $\Delta(\lambda) = | \lambda E - A |$ with respect
to the elements of the last row and the last column, we obtain:

$$\Delta(\lambda) = (\lambda - a_{nn})\, B_{nn}(\lambda) - \sum_{i,k=1}^{n-1} B_{ik}^{(n)}(\lambda)\, a_{in} a_{nk}. \tag{46}$$

Here $B_{nn}(\lambda) = | \lambda \delta_{ik} - a_{ik} |_1^{n-1}$ is the characteristic determinant of a ‘truncated’
non-negative matrix of order $n - 1$, and $B_{ik}^{(n)}(\lambda)$ is the algebraic
complement of the element $\lambda \delta_{ki} - a_{ki}$ in $B_{nn}(\lambda)$ $(i, k = 1, 2, \ldots, n-1)$. The
maximal non-negative root of $B_{nn}(\lambda)$ will be denoted by $r_n$. Then setting
$\lambda = r_n$ in (46) and observing that by the induction hypothesis

$$B_{ik}^{(n)}(r_n) \geq 0 \qquad (i, k = 1, 2, \ldots, n-1),$$

we obtain from (46):

$$\Delta(r_n) \leq 0.$$

On the other hand $\Delta(\lambda) = \lambda^n + \ldots$, so that $\Delta(+\infty) = +\infty$. Therefore
$r_n$ either is a root of $\Delta(\lambda)$ or is less than some real root of $\Delta(\lambda)$. In both
cases,


22 For since $B(\lambda) \equiv (\lambda E - A)^{-1} \Delta(\lambda)$, we have $B(\lambda) \equiv E$, $\dfrac{d}{d\lambda} B(\lambda) \equiv O$ for $n = 1$.


$$r_n \leq r.$$


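Since every principal minor $B_{jj}(\lambda)$ of order $n - 1$ can be brought into
the position of $B_{nn}(\lambda)$ by a permutation of $A$, we have

$$r_j \leq r \qquad (j = 1, 2, \ldots, n), \tag{47}$$

where $r_j$ denotes the maximal root of the polynomial $B_{jj}(\lambda)$ $(j = 1, 2, \ldots, n)$.

Furthermore, $B_{ik}(\lambda)$ may be represented as a minor of order $n - 1$ of
the characteristic matrix $\lambda E - A$, multiplied by $(-1)^{i+k}$. When we differentiate
this determinant with respect to $\lambda$, we obtain:

$$\frac{d}{d\lambda} B_{ik}(\lambda) = \sum_{j \neq i, k} B_{ik}^{(j)}(\lambda) \qquad (i, k = 1, 2, \ldots, n), \tag{48}$$

where $B^{(j)}(\lambda) = \| B_{ik}^{(j)}(\lambda) \|$ $(i \neq j,\ k \neq j;\ j = 1, 2, \ldots, n)$ is the adjoint matrix
of the matrix $\| a_{ik} \|$ $(i, k = 1, 2, \ldots, j-1, j+1, \ldots, n)$ of order $n - 1$.

But, by the induction hypothesis,

$$B^{(j)}(\lambda) \geq O \qquad \text{for } \lambda \geq r_j \quad (j = 1, 2, \ldots, n);$$

and so, by (47) and (48),

$$\frac{d}{d\lambda} B(\lambda) \geq O \qquad \text{for } \lambda \geq r. \tag{49}$$

From (45) and (49) it follows that

$$B(\lambda) \geq O \qquad \text{for } \lambda \geq r.$$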


The proof of the theorem is now complete. 


Note. In the passage to the limit (42) the inequalities (37) are preserved.
They hold, therefore, for an arbitrary non-negative matrix. However,
the conditions under which the equality sign holds in (37) are not
valid for a reducible matrix.


2. A number of important propositions follow from Theorem 3: 


1. If $A = \| a_{ik} \|_1^n$ is a non-negative matrix with maximal characteristic
value $r$ and $C(\lambda)$ is its reduced adjoint matrix, then

$$C(\lambda) \geq O \qquad \text{for } \lambda \geq r. \tag{50}$$




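For

$$C(\lambda) = \frac{B(\lambda)}{D_{n-1}(\lambda)}, \tag{51}$$

where $D_{n-1}(\lambda)$ is the greatest common divisor of the elements of $B(\lambda)$.
Since $D_{n-1}(\lambda)$ divides the characteristic polynomial $\Delta(\lambda)$ and
$D_{n-1}(\lambda) = \lambda^{n-m} + \ldots$ (where $m$ is the degree of the minimal polynomial of $A$),

$$D_{n-1}(\lambda) > 0 \qquad \text{for } \lambda > r. \tag{52}$$

Now (43), (51), and (52) imply (50).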
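2. If $A \geq O$ is an irreducible matrix with maximal characteristic value
$r$, then

$$B(\lambda) > O, \qquad C(\lambda) > O \qquad \text{for } \lambda \geq r. \tag{53}$$

Indeed, by (35) $B(r) > O$. But also (see (43)) $\dfrac{d}{d\lambda} B(\lambda) \geq O$ for $\lambda \geq r$.
Therefore

$$B(\lambda) > O \qquad \text{for } \lambda \geq r. \tag{54}$$

The other of the inequalities (53) follows from (51), (52), and (54).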


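3. If $A \geq O$ is an irreducible matrix with maximal characteristic value
$r$, then

$$(\lambda E - A)^{-1} > O \qquad \text{for } \lambda > r. \tag{55}$$

This inequality follows from the formula

$$(\lambda E - A)^{-1} = \frac{B(\lambda)}{\Delta(\lambda)},$$

since $B(\lambda) > O$ and $\Delta(\lambda) > 0$ for $\lambda > r$.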
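
Proposition 3 can be checked directly on a small matrix by forming $(\lambda E - A)^{-1}$ as the adjoint matrix divided by the characteristic determinant. This is a minimal sketch with exact rational arithmetic; the $2 \times 2$ matrix (for which $r = 2$) and the helper names are assumptions made for illustration.

```python
from fractions import Fraction

def minor(M, i, j):
    """Matrix M with row i and column j deleted."""
    return [row[:j] + row[j + 1:] for t, row in enumerate(M) if t != i]

def det(M):
    """Determinant by Laplace expansion along the first row (small orders only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))

def cofactor(M, i, j):
    """Algebraic complement of the element in row i, column j."""
    return (-1) ** (i + j) * det(minor(M, i, j))

def resolvent(A, lam):
    """(lam*E - A)^(-1) = B(lam) / Delta(lam), with B(lam) the adjoint matrix."""
    n = len(A)
    M = [[Fraction((lam if i == k else 0) - A[i][k]) for k in range(n)]
         for i in range(n)]
    delta = det(M)
    # Element (i, k) of the adjoint matrix is the complement of element (k, i).
    return [[cofactor(M, k, i) / delta for k in range(n)] for i in range(n)]

A = [[0, 1],
     [2, 1]]                      # irreducible; r = 2

above = resolvent(A, 3)           # lam > r: all elements should be positive
below = resolvent(A, 1)           # lam < r: positivity is no longer guaranteed
print(above, below)
```

For $\lambda = 3$ the inverse is positive, whereas for $\lambda = 1 < r$ it contains non-positive elements.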


4. The maximal characteristic value $r'$ of every principal minor²³ (of
order less than $n$) of a non-negative matrix $A = \| a_{ik} \|_1^n$ does not exceed
the maximal characteristic value $r$ of $A$:

$$r' \leq r. \tag{56}$$

If $A$ is irreducible, then the equality sign in (56) cannot occur.
If $A$ is reducible, then the equality sign in (56) holds for at least one
principal minor.


23 We mean here by a principal minor the matrix formed from the elements of a principal
minor.




For the inequality (56) is true for every principal minor of order $n - 1$
(see (47)). If $A$ is irreducible, then by (35’) $B_{jj}(r) > 0$ $(j = 1, 2, \ldots, n)$
and therefore $r' \neq r$.

By descent from $n - 1$ to $n - 2$, from $n - 2$ to $n - 3$, etc., we show
the truth of (56) for the principal minors of every order.

If $A$ is a reducible matrix, then by means of a permutation it can be
put into the form

$$\begin{pmatrix} B & O \\ C & D \end{pmatrix}.$$

Then $r$ must be a characteristic value of one of the two principal minors $B$
and $D$. This proves Proposition 4.
From 4. we deduce: 


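5. If $A \geq O$ and if in the characteristic determinant

$$\Delta(r) = \begin{vmatrix} r - a_{11} & -a_{12} & \cdots & -a_{1n} \\ -a_{21} & r - a_{22} & \cdots & -a_{2n} \\ \vdots & \vdots & & \vdots \\ -a_{n1} & -a_{n2} & \cdots & r - a_{nn} \end{vmatrix}$$

any principal minor vanishes ($A$ is reducible!), then every ‘augmented’ principal
minor also vanishes; in particular, so does one of the principal minors of
order $n - 1$

$$B_{11}(r),\ B_{22}(r),\ \ldots,\ B_{nn}(r).$$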


From 4. and 5. we deduce: 


6. A matrix $A \geq O$ is reducible if and only if in one of the relations

$$B_{ii}(r) \geq 0 \qquad (i = 1, 2, \ldots, n)$$

the equality sign holds.


From 4. we also deduce: 


7. If $r$ is the maximal characteristic value of a matrix $A \geq O$, then for
every $\lambda > r$ all the principal minors of the characteristic matrix $A_\lambda \equiv \lambda E - A$
are positive:

$$A_\lambda \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ i_1 & i_2 & \cdots & i_p \end{pmatrix} > 0 \qquad (\lambda > r;\ 1 \leq i_1 < i_2 < \cdots < i_p \leq n;\ p = 1, 2, \ldots, n). \tag{57}$$

It is easy to see that, conversely, (57) implies that $\lambda > r$. For




$$\Delta(\lambda + \mu) = | (\lambda + \mu) E - A | = | A_\lambda + \mu E | = \sum_{k=0}^{n} S_k \mu^{n-k},$$

where $S_k$ is the sum of all the principal minors of order $k$ of the characteristic
matrix $A_\lambda \equiv \lambda E - A$ $(k = 1, 2, \ldots, n;\ S_0 = 1)$.²⁴ Therefore, if for some real
$\lambda$ all the principal minors of $A_\lambda$ are positive, then for every $\mu \geq 0$

$$\Delta(\lambda + \mu) \neq 0,$$

i.e., no number greater than or equal to $\lambda$ is a characteristic value of $A$. Therefore

$$r < \lambda.$$

Thus, (57) is a necessary and sufficient condition for $\lambda$ to be an upper
bound for the moduli of the characteristic values of $A$.²⁵ However, the
inequalities (57) are not all independent.

The matrix $\lambda E - A$ is a matrix with non-positive elements outside the
main diagonal.²⁶ D. M. Kotelyanskii has proved that for such matrices, just
as for symmetric matrices, all the principal minors are positive, provided
the successive principal minors are positive.²⁷

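Lemma 3 (Kotelyanskii): If in a real matrix $G = \| g_{ik} \|_1^n$ all the non-diagonal
elements are negative or zero

$$g_{ik} \leq 0 \qquad (i \neq k;\ i, k = 1, 2, \ldots, n) \tag{58}$$

and the successive principal minors are positive

$$g_{11} = G \begin{pmatrix} 1 \\ 1 \end{pmatrix} > 0, \quad G \begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix} > 0, \quad \ldots, \quad G \begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix} > 0, \tag{59}$$

then all the principal minors are positive:

$$G \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ i_1 & i_2 & \cdots & i_p \end{pmatrix} > 0 \qquad (1 \leq i_1 < i_2 < \cdots < i_p \leq n;\ p = 1, 2, \ldots, n).$$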


24 See Vol. I, p. 70. 

25 See [344]. 

26 It is easy to see that, conversely, every matrix with negative or zero non-diagonal
elements can be represented in the form $\lambda E - A$, where $A$ is a non-negative matrix and $\lambda$
is a real number.

27 See [215]. This paper contains a number of results about matrices in which all the 
non-diagonal elements are of like sign. 




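Proof. We shall prove the lemma by induction on the order $n$ of the
matrix. For $n = 2$ the lemma holds, since it follows from

$$g_{12} \leq 0, \quad g_{21} \leq 0, \quad g_{11} > 0, \quad g_{11} g_{22} - g_{12} g_{21} > 0$$

that $g_{22} > 0$. Let us assume now that the lemma is true for matrices of
order less than $n$; we shall then prove it for $G = \| g_{ik} \|_1^n$. We consider the
bordered determinants

$$t_{ik} = G \begin{pmatrix} 1 & i \\ 1 & k \end{pmatrix} = g_{11} g_{ik} - g_{1k} g_{i1} \qquad (i, k = 2, \ldots, n).$$

From (58) and (59) it follows that

$$t_{ik} \leq 0 \qquad (i \neq k;\ i, k = 2, \ldots, n).$$

On the other hand, by applying Sylvester’s identity (Vol. I, Chapter II,
(30), p. 33) to the matrix $T = \| t_{ik} \|_2^n$, we obtain:

$$T \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ i_1 & i_2 & \cdots & i_p \end{pmatrix} = (g_{11})^{p-1}\, G \begin{pmatrix} 1 & i_1 & i_2 & \cdots & i_p \\ 1 & i_1 & i_2 & \cdots & i_p \end{pmatrix} \qquad (2 \leq i_1 < i_2 < \cdots < i_p \leq n;\ p = 1, 2, \ldots, n-1). \tag{60}$$

Hence it follows by (59) that the successive principal minors of the matrix
$T = \| t_{ik} \|_2^n$ are positive:

$$t_{22} = T \begin{pmatrix} 2 \\ 2 \end{pmatrix} > 0, \quad T \begin{pmatrix} 2 & 3 \\ 2 & 3 \end{pmatrix} > 0, \quad \ldots, \quad T \begin{pmatrix} 2 & 3 & \cdots & n \\ 2 & 3 & \cdots & n \end{pmatrix} > 0.$$

Thus, the matrix $T = \| t_{ik} \|_2^n$ of order $n - 1$ satisfies the condition of
the lemma. Therefore by the induction hypothesis all the principal minors
are positive:

$$T \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ i_1 & i_2 & \cdots & i_p \end{pmatrix} > 0 \qquad (2 \leq i_1 < i_2 < \cdots < i_p \leq n;\ p = 1, 2, \ldots, n-1).$$

But then it follows from (60) that all the principal minors of $G$ containing
the first row are positive:

$$G \begin{pmatrix} 1 & i_1 & i_2 & \cdots & i_p \\ 1 & i_1 & i_2 & \cdots & i_p \end{pmatrix} > 0 \qquad (2 \leq i_1 < i_2 < \cdots < i_p \leq n;\ p = 1, 2, \ldots, n-1). \tag{61}$$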




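Let us choose fixed indices $i_1, i_2, \ldots, i_{n-2}$ (where $1 < i_1 < i_2 < \cdots < i_{n-2} \leq n$)
and form the matrix of order $n - 1$:

$$\| g_{\alpha\beta} \| \qquad (\alpha, \beta = 1, i_1, i_2, \ldots, i_{n-2}). \tag{62}$$

The successive principal minors of this matrix are positive, by (61):

$$g_{11} > 0, \quad G \begin{pmatrix} 1 & i_1 \\ 1 & i_1 \end{pmatrix} > 0, \quad \ldots, \quad G \begin{pmatrix} 1 & i_1 & \cdots & i_{n-2} \\ 1 & i_1 & \cdots & i_{n-2} \end{pmatrix} > 0,$$

and the non-diagonal elements are non-positive:

$$g_{\alpha\beta} \leq 0 \qquad (\alpha \neq \beta;\ \alpha, \beta = 1, i_1, i_2, \ldots, i_{n-2}).$$

But the order of (62) is $n - 1$. Therefore, by the induction hypothesis, all
the principal minors of this matrix are positive; in particular,

$$G \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ i_1 & i_2 & \cdots & i_p \end{pmatrix} > 0 \qquad (2 \leq i_1 < i_2 < \cdots < i_p \leq n;\ p = 1, 2, \ldots, n-2). \tag{63}$$

Thus, all the principal minors of $G$ of order not exceeding $n - 2$ are positive.
Since by (63) $g_{22} > 0$, we may now consider the determinants of order
two bordering the element $g_{22}$ (and not $g_{11}$ as before):

$$t^*_{ik} = G \begin{pmatrix} 2 & i \\ 2 & k \end{pmatrix} \qquad (i, k = 1, 3, \ldots, n).$$

By operating with the matrix $T^* = \| t^*_{ik} \|$ as we have done above with $T$,
we obtain inequalities analogous to (61):

$$G \begin{pmatrix} 2 & i_1 & \cdots & i_p \\ 2 & i_1 & \cdots & i_p \end{pmatrix} > 0 \qquad (i_1 < i_2 < \cdots < i_p;\ i_1, \ldots, i_p = 1, 3, \ldots, n;\ p = 1, 2, \ldots, n-1). \tag{64}$$

Since every principal minor of $G = \| g_{ik} \|_1^n$ contains either the first or
the second row or is of order not exceeding $n - 2$, it follows from (61), (63),
and (64) that all the principal minors of $G$ are positive. This completes
the proof of the lemma.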
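
Kotelyanskii's lemma can be checked by brute force on small matrices. The sketch below is an illustration only; the two matrices and the helper names are assumptions. The first matrix satisfies (58) and (59); the second has positive successive principal minors but violates (58), and one of its principal minors is negative, which shows that the sign condition on the non-diagonal elements cannot simply be dropped.

```python
from itertools import combinations

def det(M):
    """Determinant by Laplace expansion along the first row (small orders only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def principal_minor(G, idx):
    """Minor of G formed from the rows and columns with the indices in idx."""
    return det([[G[i][k] for k in idx] for i in idx])

def successive_minors_positive(G):
    """Condition (59): the leading principal minors of every order are positive."""
    return all(principal_minor(G, range(p)) > 0 for p in range(1, len(G) + 1))

def all_principal_minors_positive(G):
    """Conclusion of the lemma, checked over every index subset."""
    n = len(G)
    return all(principal_minor(G, idx) > 0
               for p in range(1, n + 1) for idx in combinations(range(n), p))

G = [[2, -1, 0],
     [-1, 2, -1],
     [0, -1, 2]]      # non-diagonal elements <= 0, leading minors 2, 3, 4

H = [[1, 1],
     [-2, -1]]        # leading minors 1 and 1, but h_12 > 0 violates (58)

print(successive_minors_positive(G), all_principal_minors_positive(G))
print(successive_minors_positive(H), all_principal_minors_positive(H))
```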

This lemma allows us to retain only the successive principal minors in
the condition (57) and to formulate the following theorem:²⁸


28 See [344] and [215]. Since $C = A - \lambda E$ and $A \geq O$, $\lambda_n$ is real (this follows from
$\lambda_n + \lambda = r$) and the corresponding characteristic vector of $C$ is non-negative: $Cy = \lambda_n y$
$(y \geq o,\ y \neq o)$.




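THEOREM 4: A real number $\lambda$ is greater than the maximal characteristic
value $r$ of the matrix $A = \| a_{ik} \|_1^n \geq O$,

$$r < \lambda,$$

if and only if for this value $\lambda$ all the successive principal minors of the characteristic
matrix $A_\lambda \equiv \lambda E - A$ are positive:

$$\lambda - a_{11} > 0, \quad \begin{vmatrix} \lambda - a_{11} & -a_{12} \\ -a_{21} & \lambda - a_{22} \end{vmatrix} > 0, \quad \ldots, \quad \begin{vmatrix} \lambda - a_{11} & -a_{12} & \cdots & -a_{1n} \\ -a_{21} & \lambda - a_{22} & \cdots & -a_{2n} \\ \vdots & \vdots & & \vdots \\ -a_{n1} & -a_{n2} & \cdots & \lambda - a_{nn} \end{vmatrix} > 0. \tag{65}$$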
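
Theorem 4 yields a finite test for $r < \lambda$ that needs no characteristic values. The following is a minimal sketch under stated assumptions: the helper names and the $3 \times 3$ matrix are hypothetical, and the matrix is chosen so that all its row sums equal $2$, whence $r = 2$ by (37).

```python
from fractions import Fraction

def det(M):
    """Determinant by Laplace expansion along the first row (small orders only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def exceeds_maximal_value(A, lam):
    """True if all successive principal minors of lam*E - A are positive."""
    n = len(A)
    M = [[(lam if i == k else 0) - A[i][k] for k in range(n)] for i in range(n)]
    return all(det([row[:p] for row in M[:p]]) > 0 for p in range(1, n + 1))

A = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]       # irreducible, all row sums equal 2, so r = 2

for lam in (3, Fraction(21, 10), 2, Fraction(19, 10)):
    print(lam, exceeds_maximal_value(A, lam))
```

The test succeeds for $\lambda = 3$ and $\lambda = 2.1$ and fails for $\lambda = 2$ and $\lambda = 1.9$, as the theorem predicts for $r = 2$.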


Let us consider one application of Theorem 4. Suppose that in the matrix
$C = \| c_{ik} \|_1^n$ all the non-diagonal elements are non-negative. Then for some
$\lambda > 0$ we have $A = C + \lambda E \geq O$. We arrange the characteristic values $\lambda_i$
$(i = 1, 2, \ldots, n)$ of $C$ with their real parts in ascending order:

$$\operatorname{Re} \lambda_1 \leq \operatorname{Re} \lambda_2 \leq \ldots \leq \operatorname{Re} \lambda_n.$$

We denote by $r$ the maximal characteristic value of $A$. Since the characteristic
values of $A$ are the sums $\lambda_i + \lambda$ $(i = 1, 2, \ldots, n)$, we have

$$\lambda_n + \lambda = r.$$

In this case the inequality $r < \lambda$ holds for $\lambda_n < 0$ only, and signifies that all
the characteristic values of $C$ have negative real parts. When we write down
the inequality (65) for the matrix $-C = \lambda E - A$, we obtain the following
theorem:


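THEOREM 5: The real parts of all the characteristic values of a real
matrix $C = \|c_{ik}\|_1^n$ with non-negative non-diagonal elements

$$c_{ik} \ge 0 \quad (i \ne k;\ i, k = 1, 2, \ldots, n)$$

are negative if and only if

$$c_{11} < 0,\quad
\begin{vmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{vmatrix} > 0,\ \ldots,\
(-1)^n \begin{vmatrix}
c_{11} & c_{12} & \cdots & c_{1n} \\
c_{21} & c_{22} & \cdots & c_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
c_{n1} & c_{n2} & \cdots & c_{nn}
\end{vmatrix} > 0. \tag{66}$$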
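
The criteria of Theorems 4 and 5 amount to a finite test on successive principal minors. The following is a minimal sketch in Python with exact rational arithmetic, not a definitive implementation. The names `det`, `leading_minors`, `exceeds_maximal_value` and `is_stable` are illustrative and do not come from the text, and the hypotheses of the theorems ($A \ge O$, non-negative non-diagonal elements of $C$) are assumed rather than checked.

```python
from fractions import Fraction


def det(M):
    """Determinant by Gaussian elimination over exact rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        # Find a row at or below the diagonal with a non-zero pivot.
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return d


def leading_minors(M):
    """Successive principal minors of orders 1, 2, ..., n."""
    return [det([row[:k] for row in M[:k]]) for k in range(1, len(M) + 1)]


def exceeds_maximal_value(A, lam):
    """Theorem 4: for A >= O, lam > r iff all the minors in (65) are positive."""
    n = len(A)
    char = [[(lam if i == j else 0) - A[i][j] for j in range(n)]
            for i in range(n)]
    return all(m > 0 for m in leading_minors(char))


def is_stable(C):
    """Theorem 5: the sign conditions on C are the minors of -C being positive."""
    return all(m > 0 for m in leading_minors([[-x for x in row] for row in C]))
```

For instance, the matrix with rows (1, 4) and (1, 1) has maximal characteristic value 3, so `exceeds_maximal_value` should accept $\lambda = 4$ and reject $\lambda = 3$.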


§ 4. The Normal Form of a Reducible Matrix 


1. We consider an arbitrary reducible matrix $A = \|a_{ik}\|_1^n$. By means of a
permutation we can put it into the form

$$A = \begin{pmatrix} B & O \\ C & D \end{pmatrix}, \tag{67}$$

where $B$ and $D$ are square matrices.
If one of the matrices $B$ or $D$ is reducible, then it can also be represented
in a form similar to (67), so that $A$ then assumes the form

$$A = \begin{pmatrix} K & O & O \\ H & L & O \\ F & G & M \end{pmatrix}.$$


If one of the matrices K, L, M is reducible, then the process can be con- 
tinued. Finally, by a suitable permutation we can reduce A to triangular 
block form 


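$$A = \begin{pmatrix}
A_{11} & O & \cdots & O \\
A_{21} & A_{22} & \cdots & O \\
\vdots & \vdots & \ddots & \vdots \\
A_{s1} & A_{s2} & \cdots & A_{ss}
\end{pmatrix}, \tag{68}$$

where the diagonal blocks are square irreducible matrices.
A diagonal block $A_{ii}$ ($1 \le i \le s$) is called isolated if

$$A_{ik} = O \quad (k = 1, 2, \ldots, i-1, i+1, \ldots, s).$$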


By a permutation of the blocks (see p. 50) in (68) we can put all the 
isolated blocks in the first places along the main diagonal, so that A then 
assumes the form 


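$$A = \begin{pmatrix}
A_1 & O & \cdots & O & O & \cdots & O \\
O & A_2 & \cdots & O & O & \cdots & O \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
O & O & \cdots & A_g & O & \cdots & O \\
A_{g+1,1} & A_{g+1,2} & \cdots & A_{g+1,g} & A_{g+1} & \cdots & O \\
\vdots & \vdots & & \vdots & \vdots & \ddots & \vdots \\
A_{s1} & A_{s2} & \cdots & A_{sg} & A_{s,g+1} & \cdots & A_s
\end{pmatrix}; \tag{69}$$

here $A_1, A_2, \ldots, A_s$ are irreducible matrices, and in each row

$$A_{f1}, A_{f2}, \ldots, A_{f,f-1} \quad (f = g+1, \ldots, s)$$

at least one matrix is different from zero.
We shall call the matrix (69) the normal form of the reducible matrix $A$.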
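
The numbers $g$ and $s$ of the normal form (69) can be found without carrying out the permutation explicitly. The sketch below rests on an assumption not spelled out in the text: two indices belong to the same diagonal block exactly when each can be reached from the other along non-zero elements (an edge $i \to j$ whenever $a_{ij} \ne 0$). A block is then isolated when its rows have no non-zero element outside the block. The function names are illustrative only, and a block of order 1 is counted as irreducible.

```python
def classes(A):
    """Group the indices 0..n-1 into classes of mutually reachable indices.

    An edge i -> j is present when A[i][j] != 0; each class corresponds to
    one irreducible diagonal block of the normal form.
    """
    n = len(A)
    # reach[i][j] is True when j can be reached from i in zero or more steps.
    reach = [[i == j or A[i][j] != 0 for j in range(n)] for i in range(n)]
    for k in range(n):  # Warshall's transitive closure
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    seen, result = set(), []
    for i in range(n):
        if i not in seen:
            cls = [j for j in range(n) if reach[i][j] and reach[j][i]]
            seen.update(cls)
            result.append(cls)
    return result


def count_blocks(A):
    """Return (g, s): the number of isolated blocks and of all diagonal blocks."""
    n = len(A)
    cls = classes(A)
    g = 0
    for c in cls:
        inside = set(c)
        # Isolated block: its rows vanish outside the block itself.
        if all(A[i][j] == 0 for i in c for j in range(n) if j not in inside):
            g += 1
    return g, len(cls)
```

For an irreducible matrix this should give $g = s = 1$, and for a diagonal matrix of order $n$ it should give $g = s = n$.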




Let us show that the normal form of a matrix $A$ is uniquely determined
to within a permutation of the blocks and permutations within the diagonal
blocks (the same for rows and columns).²⁹ For this purpose we consider
the operator $A$ corresponding to $A$ in an $n$-dimensional vector space $R$. To
the representation of $A$ in the form (69) there corresponds a decomposition
of $R$ into coordinate subspaces

$$R = R_1 + R_2 + \cdots + R_g + R_{g+1} + \cdots + R_s; \tag{70}$$

here $R_s$, $R_{s-1} + R_s$, $R_{s-2} + R_{s-1} + R_s$, ... are invariant coordinate subspaces
for $A$, and there is no intermediate invariant subspace between any two
adjacent ones in this sequence.

Suppose then that apart from the normal form (69) of the given matrix
there is another normal form corresponding to another decomposition of $R$
into coordinate subspaces:

$$R = \tilde R_1 + \tilde R_2 + \cdots + \tilde R_t. \tag{71}$$


The uniqueness of the normal form will be proved if we can show that the 
decompositions (70) and (71) coincide apart from the order of the terms. 


Suppose that the invariant subspace $\tilde R_t$ has coordinate vectors in common
with $R_k$, but not with $R_{k+1}, \ldots, R_s$. Then $\tilde R_t$ must be entirely contained
in $R_k$, since otherwise $\tilde R_t$ would contain a 'smaller' invariant subspace,
the intersection of $\tilde R_t$ with $R_k + R_{k+1} + \cdots + R_s$. Moreover, $\tilde R_t$ must
coincide with $R_k$, since otherwise the invariant subspace $\tilde R_t + R_{k+1} + \cdots + R_s$
would be intermediate between $R_k + R_{k+1} + \cdots + R_s$ and $R_{k+1} + \cdots + R_s$.
Since $\tilde R_t$ coincides with $R_k$, $R_k$ is an invariant subspace. Therefore, without
infringing the normal form of the matrix, $R_k$ can be put in the place of $R_s$.
Thus, we may assume that in (70) and (71) $\tilde R_t = R_s$.

Let us now consider the coordinate subspace $\tilde R_{t-1}$. Suppose that it has
coordinate vectors in common with $R_l$ ($l < s$), but not with $R_{l+1}, \ldots, R_s$.
Then the invariant subspace $\tilde R_{t-1} + \tilde R_t$ must be entirely contained in
$R_l + R_{l+1} + \cdots + R_s$, since otherwise there would be an invariant coordinate
subspace intermediate between $\tilde R_t$ and $\tilde R_{t-1} + \tilde R_t$. Therefore $\tilde R_{t-1} \subset R_l$.
Moreover $\tilde R_{t-1} = R_l$, since otherwise $\tilde R_{t-1} + R_{l+1} + \cdots + R_s$ would be an
invariant subspace intermediate between $R_l + R_{l+1} + \cdots + R_s$ and
$R_{l+1} + \cdots + R_s$. From $\tilde R_{t-1} = R_l$ it follows that $R_l + R_s$ is an invariant subspace.
Therefore $R_l$ may be put in the place of $R_{s-1}$, and then we have

$$\tilde R_{t-1} = R_{s-1}, \quad \tilde R_t = R_s.$$

29 Without violating the normal form we can permute the first $g$ blocks arbitrarily
among each other. Moreover, sometimes certain permutations among the last $s - g$ blocks
are possible with preservation of the normal form.


Continuing this process, we finally reach the conclusion that $s = t$ and
that the decompositions (70) and (71) coincide apart from the order of the
terms. The corresponding normal forms then coincide to within a permutation
of the blocks.

From the uniqueness of the normal form it follows that the numbers
$g$ and $s$ are invariants of the non-negative matrix $A$.³⁰
2. Making use of the normal form, we shall now prove the following 
theorem : 

THEOREM 6: To the maximal characteristic value $r$ of the matrix $A \ge O$
there belongs a positive characteristic vector if and only if in the normal
form (69) of $A$: 1) each of the matrices $A_1, A_2, \ldots, A_g$ has $r$ as a
characteristic value; and (in case $g < s$) 2) none of the matrices $A_{g+1}, \ldots, A_s$ has
this property.

Proof. 1. Let $z > 0$ be a positive characteristic vector belonging to the
maximal characteristic value $r$. In accordance with the dissection into
blocks in (69) we dissect the column $z$ into parts $z^k$ ($k = 1, 2, \ldots, s$). Then
the equation

$$Az = rz \quad (z > 0) \tag{72}$$

is replaced by two systems of equations

$$A_i z^i = r z^i \quad (i = 1, 2, \ldots, g), \tag{72'}$$

$$\sum_{h=1}^{j-1} A_{jh} z^h + A_j z^j = r z^j \quad (j = g+1, \ldots, s). \tag{72''}$$

From (72') it follows that $r$ is a characteristic value of each of the
matrices $A_1, A_2, \ldots, A_g$. From (72'') we find:

$$A_j z^j \le r z^j, \quad A_j z^j \ne r z^j \quad (j = g+1, \ldots, s). \tag{73}$$

We denote by $r_j$ the maximal characteristic value of $A_j$ ($j = g+1, \ldots, s$).
Then (see (41) on p. 65) we find from (73):

$$r_j \le \max_i \frac{(A_j z^j)_i}{z^j_i} \le r \quad (j = g+1, \ldots, s).$$


30 For an irreducible matrix, $g = s = 1$.


On the other hand, the equation $r_j = r$ would contradict the second of the
relations (73) (see Note 5 on p. 65). Therefore

$$r_j < r \quad (j = g+1, \ldots, s). \tag{74}$$


2. Suppose now, conversely, that the maximal characteristic values of
the matrices $A_i$ ($i = 1, 2, \ldots, g$) are equal to $r$ and that (74) holds for the
matrices $A_j$ ($j = g+1, \ldots, s$). Then by replacing the required equation
(72) by the systems (72'), (72'') we can define positive characteristic
columns $z^i$ of the matrices $A_i$ ($i = 1, 2, \ldots, g$) by means of (72'). Next we
find columns $z^j$ ($j = g+1, \ldots, s$) from (72''):

$$z^j = (rE_j - A_j)^{-1} \sum_{h=1}^{j-1} A_{jh} z^h \quad (j = g+1, \ldots, s), \tag{75}$$

where $E_j$ is the unit matrix of the same order as $A_j$ ($j = g+1, \ldots, s$).
Since $r_j < r$ ($j = g+1, \ldots, s$), we have (see (55) on p. 69)

$$(rE_j - A_j)^{-1} > O \quad (j = g+1, \ldots, s). \tag{76}$$

Let us prove by induction that the columns $z^{g+1}, \ldots, z^s$ defined by (75)
are positive. We shall show that for every $j$ ($g+1 \le j \le s$) the fact that
$z^1, z^2, \ldots, z^{j-1}$ are positive implies that $z^j > 0$. Indeed, in this case,

$$\sum_{h=1}^{j-1} A_{jh} z^h \ge 0, \quad \sum_{h=1}^{j-1} A_{jh} z^h \ne 0,$$

which in conjunction with (76) yields, by (75):

$$z^j > 0.$$

Thus, the positive column $z = (z^1, \ldots, z^s)$ is a characteristic vector of $A$
for the characteristic value $r$. This completes the proof of the theorem.


3. The following theorem gives a characterization of a matrix $A \ge O$
which together with its transpose $A^{\mathsf T}$ has the property that a positive
characteristic vector belongs to the maximal characteristic value.

THEOREM 7:³¹ To the maximal characteristic value $r$ of a matrix $A \ge O$
there belongs a positive characteristic vector both of $A$ and of $A^{\mathsf T}$ if and only
if $A$ can be represented by a permutation in quasi-diagonal form

$$A = \{A_1, A_2, \ldots, A_s\}, \tag{77}$$

where $A_1, A_2, \ldots, A_s$ are irreducible matrices each of which has $r$ as its
maximal characteristic value.


31 See [166].


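Proof. Suppose that $A$ and $A^{\mathsf T}$ have positive characteristic vectors for
$\lambda = r$. Then, by Theorem 6, $A$ is representable in the normal form (69),
where $A_1, A_2, \ldots, A_g$ have $r$ as maximal characteristic value and (for $g < s$)
the maximal characteristic values of $A_{g+1}, \ldots, A_s$ are less than $r$. Then

$$A^{\mathsf T} = \begin{pmatrix}
A_1^{\mathsf T} & \cdots & O & A_{g+1,1}^{\mathsf T} & \cdots & A_{s1}^{\mathsf T} \\
\vdots & \ddots & \vdots & \vdots & & \vdots \\
O & \cdots & A_g^{\mathsf T} & A_{g+1,g}^{\mathsf T} & \cdots & A_{sg}^{\mathsf T} \\
O & \cdots & O & A_{g+1}^{\mathsf T} & \cdots & A_{s,g+1}^{\mathsf T} \\
\vdots & & \vdots & \vdots & \ddots & \vdots \\
O & \cdots & O & O & \cdots & A_s^{\mathsf T}
\end{pmatrix}.$$

Let us reverse the order of the blocks in this matrix:

$$\begin{pmatrix}
A_s^{\mathsf T} & O & \cdots & O \\
A_{s,s-1}^{\mathsf T} & A_{s-1}^{\mathsf T} & \cdots & O \\
\vdots & \vdots & \ddots & \vdots \\
A_{s1}^{\mathsf T} & A_{s-1,1}^{\mathsf T} & \cdots & A_1^{\mathsf T}
\end{pmatrix}. \tag{78}$$

Since $A_s^{\mathsf T}, A_{s-1}^{\mathsf T}, \ldots, A_1^{\mathsf T}$ are irreducible, we obtain a normal form for (78)
by a permutation of the blocks, placing the isolated blocks first along the
main diagonal. One of these isolated blocks is $A_s^{\mathsf T}$. Since the normal form
of $A^{\mathsf T}$ must satisfy the conditions of the preceding theorem, the maximal
characteristic value of $A_s^{\mathsf T}$ must be equal to $r$. This is only possible when
$g = s$. But then the normal form (69) goes over into (77).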

If, conversely, a representation (77) of $A$ is given, then

$$A^{\mathsf T} = \{A_1^{\mathsf T}, A_2^{\mathsf T}, \ldots, A_s^{\mathsf T}\}. \tag{79}$$

We then deduce from (77) and (79), by the preceding theorem, that $A$ and
$A^{\mathsf T}$ have positive characteristic vectors for the maximal characteristic value $r$.
This proves the theorem.


COROLLARY. If the maximal characteristic value $r$ of a matrix $A \ge O$
is simple and if positive characteristic vectors belong to $r$ both in $A$ and $A^{\mathsf T}$,
then $A$ is irreducible.




Since, conversely, every irreducible matrix has the properties of this 
corollary, these properties provide a spectral characterization of an irre- 
ducible non-negative matrix. 


§ 5. Primitive and Imprimitive Matrices 


1. We begin with a classification of irreducible matrices. 


DEFINITION 3: If an irreducible matrix $A \ge O$ has $h$ characteristic
values $\lambda_1, \lambda_2, \ldots, \lambda_h$ of maximal modulus $r$ ($|\lambda_1| = |\lambda_2| = \cdots = |\lambda_h| = r$),
then $A$ is called primitive if $h = 1$ and imprimitive if $h > 1$. $h$ is called the
index of imprimitivity of $A$.

The index of imprimitivity $h$ is easily determined if the coefficients of
the characteristic equation of the matrix are known:

$$\Delta(\lambda) \equiv \lambda^n + a_1 \lambda^{n_1} + a_2 \lambda^{n_2} + \cdots + a_t \lambda^{n_t} = 0$$
$$(n > n_1 > \cdots > n_t;\ a_1 \ne 0,\ a_2 \ne 0,\ \ldots,\ a_t \ne 0);$$

namely: $h$ is the greatest common divisor of the differences

$$n - n_1,\ n_1 - n_2,\ \ldots,\ n_{t-1} - n_t. \tag{80}$$


For by Frobenius' theorem the spectrum of $A$ in the complex $\lambda$-plane
goes over into itself under a rotation through $2\pi/h$ around the point $\lambda = 0$.
Therefore the polynomial $\Delta(\lambda)$ must be obtained from some polynomial
$g(\mu)$ by the formula

$$\Delta(\lambda) = g(\lambda^h)\, \lambda^{n'}.$$

Hence it follows that $h$ is a common divisor of the differences (80). But
then $h$ is the greatest common divisor $d$ of these differences, since the
spectrum does not change under a rotation by $2\pi/d$, which is impossible for $h < d$.
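
The rule (80) can be tried out on small matrices. The sketch below is only an illustration: it obtains the characteristic polynomial by the Faddeev-LeVerrier recurrence (a standard method not used in the text) in exact rational arithmetic, and then takes the greatest common divisor of the differences of the exponents that actually occur. The function names are hypothetical, and the matrix is assumed to be irreducible and non-negative; this is not checked.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def char_poly(A):
    """Coefficients [1, c_1, ..., c_n] of det(lambda*E - A), highest power first.

    Uses the Faddeev-LeVerrier recurrence with exact rational arithmetic.
    """
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    M = [[Fraction(0)] * n for _ in range(n)]
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        # M <- A*M + c_{k-1} * E
        M = [[sum(A[i][t] * M[t][j] for t in range(n))
              + (coeffs[-1] if i == j else 0)
              for j in range(n)] for i in range(n)]
        # c_k = -trace(A*M) / k
        trace = sum(A[i][t] * M[t][i] for i in range(n) for t in range(n))
        coeffs.append(-trace / k)
    return coeffs


def index_of_imprimitivity(A):
    """gcd of the differences (80); A is assumed irreducible and non-negative."""
    n = len(A)
    exponents = [n - k for k, c in enumerate(char_poly(A)) if c != 0]
    return reduce(gcd, (a - b for a, b in zip(exponents, exponents[1:])), 0)
```

A cyclic permutation matrix of order 3 has $\Delta(\lambda) = \lambda^3 - 1$, so the only difference is 3 and the sketch should report $h = 3$.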
The following theorem establishes an important property of a primitive
matrix:

THEOREM 8: A matrix $A \ge O$ is primitive if and only if some power of
$A$ is positive:

$$A^p > O \quad (p \ge 1). \tag{81}$$


Proof. If $A^p > O$, then $A$ is irreducible, since the reducibility of $A$
would imply that of $A^p$. Moreover, for $A$ we have $h = 1$, since otherwise
the positive matrix $A^p$ would have $h$ ($> 1$) characteristic values

$$\lambda_1^p, \lambda_2^p, \ldots, \lambda_h^p$$

of maximal modulus $r^p$, and this contradicts Perron's theorem.


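Suppose now, conversely, that $A$ is primitive. We apply the formula
(23) of Chapter V (Vol. I, p. 107):

$$A^p = \sum_{k=1}^{s} \frac{1}{(m_k - 1)!} \left[ \frac{C(\lambda)\, \lambda^p}{\overset{k}{\psi}(\lambda)} \right]^{(m_k - 1)}_{\lambda = \lambda_k}, \tag{82}$$

where

$$\psi(\lambda) = (\lambda - \lambda_1)^{m_1} (\lambda - \lambda_2)^{m_2} \cdots (\lambda - \lambda_s)^{m_s} \quad (\lambda_j \ne \lambda_f \text{ for } j \ne f)$$

is the minimal polynomial of $A$, $\overset{k}{\psi}(\lambda) = \dfrac{\psi(\lambda)}{(\lambda - \lambda_k)^{m_k}}$ ($k = 1, 2, \ldots, s$), and
$C(\lambda) = (\lambda E - A)^{-1} \psi(\lambda)$ is the reduced adjoint matrix.
In this case, we can set:

$$\lambda_1 = r > |\lambda_2| \ge \cdots \ge |\lambda_s| \quad \text{and} \quad m_1 = 1. \tag{83}$$

Then (82) assumes the form

$$A^p = \frac{C(r)}{\psi'(r)}\, r^p + \sum_{k=2}^{s} \frac{1}{(m_k - 1)!} \left[ \frac{C(\lambda)\, \lambda^p}{\overset{k}{\psi}(\lambda)} \right]^{(m_k - 1)}_{\lambda = \lambda_k}.$$

Hence it is easy to deduce by (83) that

$$\lim_{p \to \infty} \frac{A^p}{r^p} = \frac{C(r)}{\psi'(r)}. \tag{84}$$

On the other hand, $C(r) > O$ (see (53)) and $\psi'(r) > 0$ by (83). Therefore

$$\lim_{p \to \infty} \frac{A^p}{r^p} > O,$$

and so (81) must hold from some $p$ onwards.³² This completes the proof.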
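
Theorem 8 suggests a direct test for primitivity: form successive powers and look for one without zero elements. The sketch below works with zero patterns only, since for a non-negative matrix the position of the zeros in $A^p$ depends only on the position of the zeros in $A$. To make the search finite it relies on Wielandt's bound, which is a known result from outside this section and is stated here as an assumption: a primitive matrix of order $n$ satisfies $A^p > O$ for $p = (n-1)^2 + 1$. The function name is illustrative.

```python
def is_primitive(A):
    """Return True when some power of the non-negative matrix A is positive.

    Only the zero pattern matters, so powers are formed over booleans.
    Wielandt's bound (an outside fact, not from the text): if A is primitive
    of order n, then A**p > O already for p = (n - 1)**2 + 1.
    """
    n = len(A)
    pattern = [[A[i][j] != 0 for j in range(n)] for i in range(n)]
    power = pattern
    # Check the powers p = 1, 2, ..., (n - 1)**2 + 1 in turn.
    for _ in range((n - 1) ** 2 + 1):
        if all(all(row) for row in power):
            return True
        power = [[any(power[i][t] and pattern[t][j] for t in range(n))
                  for j in range(n)] for i in range(n)]
    return False
```

A permutation matrix of order greater than 1 is never primitive, because every power of it again has exactly one non-zero element in each row.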
We shall now prove the following theorem: 


THEOREM 9: If $A \ge O$ is an irreducible matrix and some power $A^q$ of
$A$ is reducible, then $A^q$ is completely reducible, i.e., $A^q$ can be represented
by means of a permutation in the form

$$A^q = \{A_1, A_2, \ldots, A_d\}, \tag{85}$$

where $A_1, A_2, \ldots, A_d$ are irreducible matrices having one and the same
maximal characteristic value. Here $d$ is the greatest common divisor of $q$
and $h$, where $h$ is the index of imprimitivity of $A$.


32 As regards a lower bound for the exponent $p$ in (81), see [384].


Proof. Since $A$ is irreducible, we know by Frobenius' theorem that
positive characteristic vectors belong to the maximal characteristic value $r$,
both in $A$ and in $A^{\mathsf T}$. But then these positive vectors are also characteristic
vectors of the non-negative matrices $A^q$ and $(A^q)^{\mathsf T}$ for the characteristic
value $\lambda = r^q$. Therefore by applying Theorem 7 to $A^q$, we represent this
matrix (after a suitable permutation) in the form (85), where $A_1, A_2, \ldots, A_d$
are irreducible matrices with the same maximal characteristic value $r^q$.
But $A$ has $h$ characteristic values of maximal modulus $r$:

$$r,\ r\varepsilon,\ \ldots,\ r\varepsilon^{h-1} \quad \left(\varepsilon = e^{2\pi i/h}\right).$$

Therefore $A^q$ also has $h$ characteristic values of maximal modulus $r^q$:

$$r^q,\ r^q \varepsilon^q,\ \ldots,\ r^q \varepsilon^{q(h-1)},$$

among which $d$ are equal to $r^q$. This is only possible when $d$ is the greatest
common divisor of $q$ and $h$. This proves the theorem.

For $h = 1$, we obtain:

COROLLARY 1: A power of a primitive matrix is irreducible and primitive.

If we set $q = h$ in the theorem, then we obtain:

COROLLARY 2: If $A$ is an imprimitive matrix with index of imprimitivity
$h$, then $A^h$ splits into $h$ primitive matrices with the same maximal
characteristic value.


§ 6. Stochastic Matrices 


1. We consider $n$ possible states of a certain system

$$S_1, S_2, \ldots, S_n \tag{86}$$

and a sequence of instants

$$t_0, t_1, t_2, \ldots.$$

Suppose that at each of these instants the system is in one and only one
of the states (86) and that $p_{ij}$ denotes the probability of finding the system
in the state $S_j$ at the instant $t_k$ if it is known that at the preceding instant
$t_{k-1}$ the system is in the state $S_i$ ($i, j = 1, 2, \ldots, n$; $k = 1, 2, \ldots$). We shall
assume that the transition probability $p_{ij}$ ($i, j = 1, 2, \ldots, n$) does not depend
on the index $k$ (of the instant $t_k$).

If the matrix of transition probabilities is given,

$$P = \|p_{ij}\|_1^n,$$




then we say that we have a homogeneous Markov chain with a finite number
of states.³³ It is obvious that

$$p_{ij} \ge 0, \quad \sum_{j=1}^{n} p_{ij} = 1 \quad (i, j = 1, 2, \ldots, n). \tag{87}$$

DEFINITION 4: A square matrix $P = \|p_{ij}\|_1^n$ is called stochastic if $P$ is
non-negative and if the sum of the elements of each row of $P$ is 1, i.e., if the
relations (87) hold.³⁴


Thus, for every homogeneous Markov chain the matrix of transition 
probabilities is stochastic and, conversely, every stochastic matrix can be 
regarded as the matrix of transition probabilities of some homogeneous 
Markov chain. This is the basis of the matrix method of investigating
homogeneous Markov chains.³⁵

A stochastic matrix is a special form of a non-negative matrix. There- 
fore all the concepts and propositions of the preceding sections are applicable 
to it. 

We mention some specific properties of a stochastic matrix. From the
definition of a stochastic matrix it follows that it has the characteristic value
1 with the positive characteristic vector $z = (1, 1, \ldots, 1)$. It is easy to see
that, conversely, every matrix $P \ge O$ having the characteristic vector
$(1, 1, \ldots, 1)$ for the characteristic value 1 is stochastic. Moreover, 1 is the
maximal characteristic value of a stochastic matrix, since the maximal
characteristic value is always included between the largest and the smallest of
the row sums³⁶ and in a stochastic matrix all the row sums are 1. Thus,
we have proved the proposition:

1. A non-negative matrix $P \ge O$ is stochastic if and only if it has the
characteristic vector $(1, 1, \ldots, 1)$ for the characteristic value 1. For a
stochastic matrix the maximal characteristic value is 1.


Now let $A = \|a_{ik}\|_1^n$ be a non-negative matrix with a positive maximal
characteristic value $r > 0$ and a corresponding positive characteristic vector


33 See [212] and [46], pp. 9-12.

34 Sometimes the additional condition $\sum_{i=1}^{n} p_{ij} \ne 0$ ($j = 1, 2, \ldots, n$) is included in the
definition of a stochastic matrix. See [46], p. 13.


35 The theory of homogeneous Markov chains with a finite (and a countable) number 
of states was introduced by Kolmogorov (see [212]). The reader can find an account 
of the later introduction and development of the matrix method with applications to 
homogeneous Markov chains in the memoir [329] and in the monograph [46] by V. I. 
Romanovskii (see also [4], Appendix 5). 


36 See (37) and the note on p. 68.


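$z = (z_1, z_2, \ldots, z_n) > 0$:

$$\sum_{j=1}^{n} a_{ij} z_j = r z_i \quad (i = 1, 2, \ldots, n). \tag{88}$$

We introduce the diagonal matrix $Z = \{z_1, z_2, \ldots, z_n\}$ and the matrix
$P = \|p_{ij}\|_1^n$:

$$P = \frac{1}{r}\, Z^{-1} A Z.$$

Then

$$p_{ij} = \frac{1}{r}\, z_i^{-1} a_{ij} z_j \ge 0 \quad (i, j = 1, 2, \ldots, n),$$

and by (88)

$$\sum_{j=1}^{n} p_{ij} = 1 \quad (i = 1, 2, \ldots, n).$$

Thus:

2. A non-negative matrix $A \ge O$ with the maximal positive characteristic
value $r > 0$ and with a corresponding positive characteristic vector
$z = (z_1, z_2, \ldots, z_n) > 0$ is similar to the product of $r$ and a stochastic matrix:³⁷

$$A = Z\, rP\, Z^{-1} \quad (Z = \{z_1, z_2, \ldots, z_n\} > 0). \tag{89}$$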


In a preceding section we have given (see Theorem 6, § 4) a characterization
of the class of non-negative matrices having a positive characteristic
vector for $\lambda = r$. The formula (89) establishes a close connection between
this class and the class of stochastic matrices.
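
The passage from $A$ to the stochastic matrix $P = \frac{1}{r} Z^{-1} A Z$ is a one-line computation once $r$ and $z$ are known. The following sketch assumes that both are supplied by the caller and satisfy (88); it does not compute them. The function name `to_stochastic` and the numerical data are illustrative and are not taken from the text.

```python
from fractions import Fraction


def to_stochastic(A, r, z):
    """Return P = (1/r) * Z^{-1} * A * Z with Z = diag(z).

    Assumes r > 0 is the maximal characteristic value of A >= O and z > 0
    is a characteristic vector for it, i.e. A z = r z as in (88).
    """
    n = len(A)
    return [[Fraction(A[i][j]) * z[j] / (r * z[i]) for j in range(n)]
            for i in range(n)]


# Illustrative data (not from the text): A has characteristic values 3 and -1,
# and A z = 3 z for z = (2, 1).
A = [[1, 4], [1, 1]]
P = to_stochastic(A, r=3, z=[2, 1])
row_sums = [sum(row) for row in P]
```

With these data every element of `P` is non-negative and every entry of `row_sums` equals 1, as (88) requires.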


37 Proposition 2. also holds for $r = 0$, since $A \ge O$, $z > 0$ implies that $A = O$.

2. We shall now prove the following theorem:

THEOREM 10: To the characteristic value 1 of a stochastic matrix there
always correspond only elementary divisors of the first degree.

Proof. We apply the decomposition (69), § 4, to the stochastic matrix
$P = \|p_{ij}\|_1^n$:

$$P = \begin{pmatrix}
A_1 & O & \cdots & O & O & \cdots & O \\
O & A_2 & \cdots & O & O & \cdots & O \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
O & O & \cdots & A_g & O & \cdots & O \\
A_{g+1,1} & A_{g+1,2} & \cdots & A_{g+1,g} & A_{g+1} & \cdots & O \\
\vdots & \vdots & & \vdots & \vdots & \ddots & \vdots \\
A_{s1} & A_{s2} & \cdots & A_{sg} & A_{s,g+1} & \cdots & A_s
\end{pmatrix},$$

where $A_1, A_2, \ldots, A_s$ are irreducible and

$$A_{f1} + A_{f2} + \cdots + A_{f,f-1} \ne O \quad (f = g+1, \ldots, s).$$

Here $A_1, A_2, \ldots, A_g$ are stochastic matrices, so that each has the simple
characteristic value 1. As regards the remaining irreducible matrices
$A_{g+1}, \ldots, A_s$, by the Remark 2 on p. 63 their maximal characteristic values
are less than 1, since in each of these matrices at least one row sum is less
than 1.³⁸

Thus, the matrix $P$ is representable in the form

$$P = \begin{pmatrix} Q_1 & O \\ S & Q_2 \end{pmatrix},$$

where in $Q_1$ to the value 1 there correspond elementary divisors of the first
degree, and where 1 is not a characteristic value of $Q_2$. The theorem now
follows immediately from the following lemma:

LEMMA 4: If a matrix $A$ has the form

$$A = \begin{pmatrix} Q_1 & O \\ S & Q_2 \end{pmatrix}, \tag{90}$$

where $Q_1$ and $Q_2$ are square matrices, and if the characteristic value $\lambda_0$ of $A$
is also a characteristic value of $Q_1$, but not of $Q_2$,

$$|Q_1 - \lambda_0 E| = 0, \quad |Q_2 - \lambda_0 E| \ne 0,$$

then the elementary divisors of $A$ and $Q_1$ corresponding to the characteristic
value $\lambda_0$ are the same.


Proof. 1. To begin with, we consider the case where $Q_1$ and $Q_2$ do not
have characteristic values in common. Let us show that in this case the
elementary divisors of $Q_1$ and $Q_2$ together form the system of elementary
divisors of $A$, i.e., for some matrix $T$ ($|T| \ne 0$)

$$TAT^{-1} = \begin{pmatrix} Q_1 & O \\ O & Q_2 \end{pmatrix}. \tag{91}$$

We shall look for the matrix $T$ in the form

$$T = \begin{pmatrix} E_1 & O \\ U & E_2 \end{pmatrix}$$

(the dissection of $T$ into blocks corresponds to that of $A$; $E_1$ and $E_2$ are unit
matrices). Then

$$TAT^{-1} = \begin{pmatrix} E_1 & O \\ U & E_2 \end{pmatrix}
\begin{pmatrix} Q_1 & O \\ S & Q_2 \end{pmatrix}
\begin{pmatrix} E_1 & O \\ -U & E_2 \end{pmatrix}
= \begin{pmatrix} Q_1 & O \\ UQ_1 - Q_2 U + S & Q_2 \end{pmatrix}. \tag{91'}$$

The equation (91') reduces to (91) if we choose the rectangular matrix
$U$ so that it satisfies the matrix equation

$$Q_2 U - U Q_1 = S.$$

If $Q_1$ and $Q_2$ have no characteristic values in common, then this equation
always has a unique solution for every right-hand side $S$ (see Vol. I, Chapter
VIII, § 3).

38 These properties of the matrices $A_1, \ldots, A_s$ also follow from Theorem 6.

2. In the case where $Q_1$ and $Q_2$ have characteristic values in common,
we replace $Q_1$ in (90) by its Jordan form $J$ (as a result, $A$ is replaced by a
similar matrix). Let $J = \{J_1, J_2\}$, where all the Jordan blocks with the
characteristic value $\lambda_0$ are combined in $J_1$. Then

$$A = \begin{pmatrix} J_1 & O & O \\ O & J_2 & O \\ S_1 & S_2 & Q_2 \end{pmatrix}
= \begin{pmatrix} J_1 & O \\ \hat S & \hat Q_2 \end{pmatrix},
\qquad \hat S = \begin{pmatrix} O \\ S_1 \end{pmatrix}, \quad
\hat Q_2 = \begin{pmatrix} J_2 & O \\ S_2 & Q_2 \end{pmatrix}.$$

This matrix falls under the preceding case, since the matrices $J_1$ and $\hat Q_2$ have
no characteristic values in common. Hence it follows that the elementary
divisors of the form $(\lambda - \lambda_0)^p$ are the same for $A$ and $J_1$ and therefore also
for $A$ and $Q_1$. This proves the lemma.

If an irreducible stochastic matrix $P$ has a complex characteristic value
$\lambda_0$ with $|\lambda_0| = 1$, then $\lambda_0 P$ is similar to $P$ (see (16)) and so it follows from
Theorem 10 that to $\lambda_0$ there correspond only elementary divisors of the first
degree. With the help of the normal form and of Lemma 4 it is easy to
extend this statement to reducible stochastic matrices. Thus we obtain:

COROLLARY 1. If $\lambda_0$ is a characteristic value of a stochastic matrix $P$ and
$|\lambda_0| = 1$, then the elementary divisors corresponding to $\lambda_0$ are of the first
degree.

From Theorem 10 we also deduce by 2. (p. 84):

COROLLARY 2. If a positive characteristic vector belongs to the maximal
characteristic value $r$ of a non-negative matrix $A$, then all the elementary
divisors of $A$ that belong to a characteristic value $\lambda_0$ with $|\lambda_0| = r$ are of the
first degree.




We shall now mention some papers that deal with the distribution of the 
characteristic values of stochastic matrices. 

A characteristic value of a stochastic matrix $P$ always lies in the disc
$|\lambda| \le 1$ of the $\lambda$-plane. The set of all points of this disc that are
characteristic values of any stochastic matrices of order $n$ will be denoted by $M_n$.

3. In 1938, in connection with investigations on Markov chains, A. N.
Kolmogorov raised the problem of determining the structure of the domain $M_n$.
This problem was partially solved in 1945 by N. A. Dmitriev and E. B.
Dynkin [133], [133a] and completely in 1951 in a paper by F. I. Karpelevich
[209]. It turned out that the boundary of $M_n$ consists of a finite number
of points on the circle $|\lambda| = 1$ and certain curvilinear arcs joining these
points in cyclic order.

We note that by Proposition 2. (p. 84) the characteristic values of the
matrices $A = \|a_{ik}\|_1^n \ge O$ having a positive characteristic vector for $\lambda = r$
with a fixed $r$ form the set $r \cdot M_n$.³⁹ Since every matrix $A = \|a_{ik}\|_1^n \ge O$
can be regarded as the limit of a sequence of non-negative matrices of that
type and the set $r \cdot M_n$ is closed, the characteristic values of arbitrary matrices
$A = \|a_{ik}\|_1^n \ge O$ with a given maximal characteristic value $r$ fill out the
set $r \cdot M_n$.⁴⁰

A paper by H. R. Suleimanova [359] is relevant in this context; it
contains sufficiency criteria for $n$ given real numbers $\lambda_1, \lambda_2, \ldots, \lambda_n$ to be the
characteristic values of a stochastic matrix $P = \|p_{ij}\|_1^n$.⁴¹


§ 7. Limiting Probabilities for a Homogeneous Markov Chain 
with a Finite Number of States 


1. Let

$$S_1, S_2, \ldots, S_n$$

be all the possible states of a system in a homogeneous Markov chain and let
$P = \|p_{ij}\|_1^n$ be the stochastic matrix determined by this chain that is formed
from the transition probabilities $p_{ij}$ ($i, j = 1, 2, \ldots, n$) (see p. 82).

We denote by $p_{ij}^{(q)}$ the probability of finding the system in the state $S_j$ at
the instant $t_k$ if it is known that at the instant $t_{k-q}$ it is in the state $S_i$
($i, j = 1, 2, \ldots, n$; $q = 1, 2, \ldots$). Clearly, $p_{ij}^{(1)} = p_{ij}$ ($i, j = 1, 2, \ldots, n$).


39 $r \cdot M_n$ is the set of points in the $\lambda$-plane of the form $r\mu$, where $\mu \in M_n$.

40 Kolmogorov has shown (see [133a (1946)], Appendix) that this problem for an
arbitrary matrix $A \ge O$ can be reduced to the analogous problem for a stochastic matrix.

41 See also [312].


Making use of the theorems on the addition and multiplication of probabili- 
ties, we find easily: 


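$$p_{ij}^{(q+1)} = \sum_{h=1}^{n} p_{ih}^{(q)}\, p_{hj} \quad (i, j = 1, 2, \ldots, n)$$

or, in matrix notation,

$$\|p_{ij}^{(q+1)}\| = \|p_{ij}^{(q)}\| \cdot \|p_{ij}\|.$$

Hence, by giving to $q$ in succession the values 1, 2, ..., we obtain the
important formula⁴²

$$\|p_{ij}^{(q)}\| = P^q \quad (q = 1, 2, \ldots).$$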
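
The recurrence for the $q$-step transition probabilities can be run directly. The sketch below multiplies by $P$ one step at a time, exactly as the recurrence does; the function names and the two-state chain are assumptions chosen for illustration and do not appear in the text.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][h] * Y[h][j] for h in range(n)) for j in range(n)]
            for i in range(n)]


def q_step(P, q):
    """Matrix of q-step transition probabilities (q >= 1), i.e. P**q."""
    result = P
    for _ in range(q - 1):
        # One more step: ||p_ij^(q+1)|| = ||p_ij^(q)|| * P
        result = mat_mul(result, P)
    return result


# Illustrative two-state chain (an assumption, not taken from the text).
P = [[Fraction(1, 2), Fraction(1, 2)],
     [Fraction(1, 4), Fraction(3, 4)]]
P2 = q_step(P, 2)
```

Each row of `P2` again sums to 1, in agreement with the fact that a product of stochastic matrices is stochastic.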
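If the limits

$$\lim_{q \to \infty} p_{ij}^{(q)} = p_{ij}^{\infty} \quad (i, j = 1, 2, \ldots, n)$$

or, in matrix notation,

$$\lim_{q \to \infty} P^q = P^{\infty} = \|p_{ij}^{\infty}\|_1^n$$

exist, then the values $p_{ij}^{\infty}$ ($i, j = 1, 2, \ldots, n$) are called the limiting or final
transition probabilities.⁴³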

In order to investigate under what conditions limiting transition proba- 
bilities exist and to derive the corresponding formulas, we introduce the fol- 
lowing terminology. 

We shall call a stochastic matrix P and the corresponding homogeneous 
Markov chain regular if P has no characteristic values of modulus 1 other 
than 1 itself and fully regular if, in addition, 1 is a simple root of the 
characteristic equation of P. 

A regular matrix $P$ is characterized by the fact that in its normal form
(69) (p. 75) the matrices $A_1, A_2, \ldots, A_g$ are primitive. For a fully regular
matrix we have, in addition, $g = 1$.

Furthermore, a homogeneous Markov chain is irreducible, reducible,
acyclic or cyclic if the stochastic matrix $P$ of the chain is irreducible,
reducible, primitive, or imprimitive, respectively. Just as a primitive stochastic
matrix is a special form of a regular matrix, so an acyclic Markov chain is a
special form of a regular chain.

We shall prove that: Limiting transition probabilities exist for regular 
homogeneous Markov chains only. 


42 It follows from this formula that the probabilities $p_{ij}^{(q)}$ as well as $p_{ij}$ ($i, j = 1, 2,
\ldots, n$; $q = 1, 2, \ldots$) do not depend on the index $k$ of the original instant $t_k$.

43 The matrix $P^{\infty}$, as the limit of stochastic matrices, is itself stochastic.


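For let $\psi(\lambda)$ be the minimal polynomial of the regular matrix $P = \|p_{ij}\|_1^n$.
Then

$$\psi(\lambda) = (\lambda - \lambda_1)^{m_1} (\lambda - \lambda_2)^{m_2} \cdots (\lambda - \lambda_u)^{m_u} \quad (\lambda_i \ne \lambda_k;\ i, k = 1, 2, \ldots, u). \tag{92}$$

By Theorem 10 we may assume that

$$\lambda_1 = 1, \quad m_1 = 1. \tag{93}$$

By the formula (23) of Chapter V (Vol. I, p. 107),

$$P^q = \frac{C(1)}{\overset{1}{\psi}(1)} + \sum_{k=2}^{u} \frac{1}{(m_k - 1)!} \left[ \frac{C(\lambda)\, \lambda^q}{\overset{k}{\psi}(\lambda)} \right]^{(m_k - 1)}_{\lambda = \lambda_k}, \tag{94}$$

where $C(\lambda) = (\lambda E - P)^{-1} \psi(\lambda)$ is the reduced adjoint matrix and

$$\overset{k}{\psi}(\lambda) = \frac{\psi(\lambda)}{(\lambda - \lambda_k)^{m_k}} \quad (k = 1, 2, \ldots, u);$$

moreover

$$\overset{1}{\psi}(\lambda) = \frac{\psi(\lambda)}{\lambda - 1} \quad \text{and} \quad \overset{1}{\psi}(1) = \psi'(1).$$

If $P$ is a regular matrix, then

$$|\lambda_k| < 1 \quad (k = 2, 3, \ldots, u),$$

and therefore all the terms on the right-hand side of (94), except the first,
tend to zero for $q \to \infty$. Therefore, for a regular matrix $P$ the matrix $P^{\infty}$
formed from the limiting transition probabilities exists, and

$$P^{\infty} = \frac{C(1)}{\psi'(1)}. \tag{95}$$

The converse proposition is obvious. If the limit

$$P^{\infty} = \lim_{q \to \infty} P^q \tag{96}$$

exists, then the matrix $P$ cannot have any characteristic value $\lambda_k$ for which
$\lambda_k \ne 1$ and $|\lambda_k| = 1$, since then the limit $\lim_{q \to \infty} \lambda_k^q$ would not exist. (This
limit must exist, since the limit (96) exists.)
We have proved that the matrix $P^{\infty}$ exists for a regular homogeneous
Markov chain (and for such a regular chain only). This matrix is
determined by (95).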




We shall now show that $P^{\infty}$ can be expressed by the characteristic
polynomial

$$\Delta(\lambda) = (\lambda - \lambda_1)^{n_1} (\lambda - \lambda_2)^{n_2} \cdots (\lambda - \lambda_u)^{n_u} \tag{97}$$

and the adjoint matrix $B(\lambda) = (\lambda E - P)^{-1} \Delta(\lambda)$.
From the identity

$$\frac{B(\lambda)}{\Delta(\lambda)} = \frac{C(\lambda)}{\psi(\lambda)}$$

it follows by (92), (93), and (97) that

$$\frac{n_1 B^{(n_1 - 1)}(1)}{\Delta^{(n_1)}(1)} = \frac{C(1)}{\psi'(1)}.$$

Therefore (95) may be replaced by the formula

$$P^{\infty} = \frac{n_1 B^{(n_1 - 1)}(1)}{\Delta^{(n_1)}(1)}. \tag{98}$$

For a fully regular Markov chain, inasmuch as it is a special form of a
regular chain, the matrix $P^{\infty}$ exists and is determined by (95) or (98). In
this case $n_1 = 1$, and (98) assumes the form

$$P^{\infty} = \frac{B(1)}{\Delta'(1)}. \tag{99}$$


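2. Let us consider a regular chain of general type (not fully regular). We
write the corresponding matrix $P$ in the normal form

$$P = \begin{pmatrix}
Q_1 & \cdots & O & O & \cdots & O \\
\vdots & \ddots & \vdots & \vdots & & \vdots \\
O & \cdots & Q_g & O & \cdots & O \\
U_{g+1,1} & \cdots & U_{g+1,g} & Q_{g+1} & \cdots & O \\
\vdots & & \vdots & \vdots & \ddots & \vdots \\
U_{s1} & \cdots & U_{sg} & U_{s,g+1} & \cdots & Q_s
\end{pmatrix}, \tag{100}$$

where $Q_1, \ldots, Q_g$ are primitive stochastic matrices and the maximal
characteristic values of the irreducible matrices $Q_{g+1}, \ldots, Q_s$ are less than 1. Setting

$$U = \begin{pmatrix} U_{g+1,1} & \cdots & U_{g+1,g} \\ \vdots & & \vdots \\ U_{s1} & \cdots & U_{sg} \end{pmatrix}, \quad
W = \begin{pmatrix} Q_{g+1} & \cdots & O \\ \vdots & \ddots & \vdots \\ U_{s,g+1} & \cdots & Q_s \end{pmatrix},$$

we write $P$ in the form

$$P = \begin{pmatrix}
Q_1 & \cdots & O & O \\
\vdots & \ddots & \vdots & \vdots \\
O & \cdots & Q_g & O \\
\multicolumn{3}{c}{U} & W
\end{pmatrix}.$$

Then

$$P^q = \begin{pmatrix}
Q_1^q & \cdots & O & O \\
\vdots & \ddots & \vdots & \vdots \\
O & \cdots & Q_g^q & O \\
\multicolumn{3}{c}{U_q} & W^q
\end{pmatrix} \tag{101}$$

and

$$P^{\infty} = \lim_{q \to \infty} P^q = \begin{pmatrix}
Q_1^{\infty} & \cdots & O & O \\
\vdots & \ddots & \vdots & \vdots \\
O & \cdots & Q_g^{\infty} & O \\
\multicolumn{3}{c}{U_{\infty}} & W^{\infty}
\end{pmatrix}.$$

But $W^{\infty} = \lim_{q \to \infty} W^q = O$, because all the characteristic values of $W$ are of
modulus less than 1. Therefore

$$P^{\infty} = \begin{pmatrix}
Q_1^{\infty} & \cdots & O & O \\
\vdots & \ddots & \vdots & \vdots \\
O & \cdots & Q_g^{\infty} & O \\
\multicolumn{3}{c}{U_{\infty}} & O
\end{pmatrix}. \tag{102}$$

Since $Q_1, \ldots, Q_g$ are primitive stochastic matrices, the matrices $Q_1^{\infty}, \ldots,
Q_g^{\infty}$ are positive, by (99) and (35) (p. 62):

$$Q_1^{\infty} > O,\ \ldots,\ Q_g^{\infty} > O,$$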


and in each of these matrices all the elements belonging to any one column 
are equal: 


r= (O51 @=1, 2 -..,9)- 


We note that the states $S_1, S_2, \ldots, S_n$ of the system fall into groups
corresponding to the normal form (100) of $P$:

$$\Sigma_1, \Sigma_2, \ldots, \Sigma_g, \Sigma_{g+1}, \ldots, \Sigma_s. \tag{103}$$

To each group $\Sigma$ in (103) there corresponds a group of rows in (100).
In the terminology of Kolmogorov the states of the system that occur in
$\Sigma_1, \Sigma_2, \ldots, \Sigma_g$ are called essential and the states that occur in the remaining
groups $\Sigma_{g+1}, \ldots, \Sigma_s$ non-essential.

From the form (101) of $P^q$ it follows that in any finite number $q$ of steps
(from the instant $t_{k-q}$ to $t_k$) only the following transitions of the system are
possible: a) from an essential state to an essential state of the same group;
b) from a non-essential state to an essential state; and c) from a non-essential
state to a non-essential state of the same or a preceding group.

From the form (102) of $P^{\infty}$ it follows that: A limiting transition can
only lead from an arbitrary state to an essential state, i.e., the probability
of transition to any non-essential state tends to zero when the number of
steps $q$ tends to infinity. The essential states are therefore sometimes also
called limiting states.


3. From (95) it follows that⁴⁵

$$(E - P) P^{\infty} = O.$$

Hence it is clear that: Every column of $P^{\infty}$ is a characteristic vector of the
stochastic matrix $P$ for the characteristic value $\lambda = 1$.

For a fully regular matrix $P$, 1 is a simple root of the characteristic
equation and (apart from scalar factors) only one characteristic vector $(1, 1, \ldots, 1)$
of $P$ belongs to it. Therefore all the elements of the $j$-th column of $P^{\infty}$
are equal to one and the same non-negative number $p_{*j}^{\infty}$:

$$p_{ij}^{\infty} = p_{*j}^{\infty} \ge 0 \quad \Big(i, j = 1, 2, \ldots, n;\ \sum_{j=1}^{n} p_{*j}^{\infty} = 1\Big). \tag{104}$$


44 See [212] and [46], pp. 37-39.

45 This formula holds for an arbitrary regular chain and can be obtained from the
obvious equation $P^q - P \cdot P^{q-1} = O$ by passing to the limit $q \to \infty$.


Thus, in a fully regular chain the limiting transition probabilities do 
not depend on the initial state. 

Conversely, if in a regular homogeneous Markov chain the limiting
transition probabilities do not depend on the initial state, i.e., if (104) holds,
then obviously in the scheme (102) for $P^{\infty}$ we have $g = 1$. But then $n_1 = 1$
and the chain is fully regular.

For an acyclic chain, which is a special case of a fully regular chain,
$P$ is a primitive matrix. Therefore $P^q > O$ (see Theorem 8 on p. 80) for
some $q > 0$. But then also $P^{\infty} = P^{\infty} P^q > O$.⁴⁶

Conversely, it follows from $P^{\infty} > O$ that $P^q > O$ for some $q > 0$, and
this means by Theorem 8 that $P$ is primitive and hence that the given
homogeneous Markov chain is acyclic.


We formulate these results in the following theorem: 

THEOREM 11: 1. In a homogeneous Markov chain all the limiting
transition probabilities exist if and only if the chain is regular. In that case the
matrix $P^{\infty}$ formed from the limiting transition probabilities is determined
by (95) or (98).

2. In a regular homogeneous Markov chain the limiting transition
probabilities are independent of the initial state if and only if the chain is fully
regular. In that case the matrix $P^{\infty}$ is determined by (99).

3. In a regular homogeneous Markov chain all the limiting transition
probabilities are different from zero if and only if the chain is acyclic.⁴⁷
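
Formula (99) can be evaluated exactly for small chains. The sketch below forms the adjoint matrix $B(1)$ of $E - P$ by cofactors and divides by $\Delta'(1)$. It computes $\Delta'(1)$ as the trace of $B(1)$, which rests on Jacobi's formula for the derivative of a determinant; that identity is brought in from outside the text. The chain is assumed to be fully regular (only the simplicity of the root 1 is checked), and the function names and sample chain are illustrative.

```python
from fractions import Fraction


def det(M):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if not M:
        return Fraction(1)
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def limiting_matrix(P):
    """P^infinity = B(1) / Delta'(1) for a fully regular stochastic matrix P.

    B(1) is the adjoint (adjugate) matrix of E - P. Delta'(1) is evaluated as
    the trace of B(1), using Jacobi's formula (not stated in the text).
    """
    n = len(P)
    M = [[(1 if i == j else 0) - Fraction(P[i][j]) for j in range(n)]
         for i in range(n)]

    def minor(i, j):
        # Delete row i and column j of M.
        return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]

    # Adjugate: the element (i, j) is the cofactor of the element (j, i).
    B = [[(-1) ** (i + j) * det(minor(j, i)) for j in range(n)]
         for i in range(n)]
    d = sum(B[i][i] for i in range(n))
    if d == 0:
        raise ValueError("1 is not a simple root of the characteristic equation")
    return [[b / d for b in row] for row in B]


# Illustrative acyclic two-state chain (an assumption, not from the text).
P = [[Fraction(1, 2), Fraction(1, 2)],
     [Fraction(1, 4), Fraction(3, 4)]]
P_inf = limiting_matrix(P)
```

For this chain both rows of `P_inf` coincide and contain no zeros, which is what parts 2 and 3 of Theorem 11 lead one to expect for an acyclic chain.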


4. We now consider the columns of absolute probabilities

$$\overset{k}{p} = (\overset{k}{p}_1, \overset{k}{p}_2, \ldots, \overset{k}{p}_n) \quad (k = 0, 1, 2, \ldots), \tag{105}$$

where $\overset{k}{p}_i$ is the probability of finding the system in the state $S_i$ ($i = 1, 2, \ldots,
n$; $k = 0, 1, 2, \ldots$) at the instant $t_k$. Making use of the theorems on the
addition and multiplication of probabilities, we find:

$$\overset{k}{p}_i = \sum_{h=1}^{n} \overset{0}{p}_h\, p_{hi}^{(k)} \quad (i = 1, 2, \ldots, n;\ k = 1, 2, \ldots)$$

or, in matrix notation,


46 This matrix equation is obtained by passing to the limit $m \to \infty$ from the equation
$P^{m} = P^{m-q} P^{q}$ ($m > q$). $P^{\infty}$ is a stochastic matrix; therefore $P^{\infty} \ge O$ and there are
non-zero elements in every row of $P^{\infty}$. Hence $P^{\infty} P^{q} > O$. Instead of Theorem 8 we can
use here the formula (99) and the inequality (35) (p. 62).

47 Note that $P^{\infty} > O$ implies that the chain is acyclic and therefore regular. Hence it
follows automatically from $P^{\infty} > O$ that the limiting transition probabilities do not
depend on the initial state, i.e., that the formulas (104) hold.


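\[ \overset{k}{p} = (P^{\mathsf T})^{k}\, \overset{0}{p} \quad (k = 1, 2, \dots), \tag{106} \]

where $P^{\mathsf T}$ is the transpose of $P$.
All the absolute probabilities (105) can be determined from (106) if the
initial probabilities $\overset{0}{p}_1, \overset{0}{p}_2, \dots, \overset{0}{p}_n$ and the matrix of transition probabilities
$P = \| p_{ij} \|_1^n$ are known.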
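We introduce the limiting absolute probabilities

\[ \overset{\infty}{p}_i = \lim_{k \to \infty} \overset{k}{p}_i \quad (i = 1, 2, \dots, n) \]

or

\[ \overset{\infty}{p} = (\overset{\infty}{p}_1, \overset{\infty}{p}_2, \dots, \overset{\infty}{p}_n) = \lim_{k \to \infty} \overset{k}{p}. \]

When we take the limit $k \to \infty$ on both sides of (106), we obtain:

\[ \overset{\infty}{p} = (P^{\infty})^{\mathsf T}\, \overset{0}{p}. \tag{107} \]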


Note that the existence of the matrix of limiting transition probabilities
$P^{\infty}$ implies the existence of the limiting absolute probabilities

\[ \overset{\infty}{p} = (\overset{\infty}{p}_1, \overset{\infty}{p}_2, \dots, \overset{\infty}{p}_n) \]

for arbitrary initial probabilities $\overset{0}{p} = (\overset{0}{p}_1, \overset{0}{p}_2, \dots, \overset{0}{p}_n)$, and vice versa.
From the formula (107) and the form (102) of $P^{\infty}$ it follows that: The
limiting absolute probabilities corresponding to non-essential states are zero.

Multiplying both sides of the matrix equation

\[ P^{\mathsf T} (P^{\infty})^{\mathsf T} = (P^{\infty})^{\mathsf T} \]

by $\overset{0}{p}$ on the right, we obtain by (107):

\[ P^{\mathsf T}\, \overset{\infty}{p} = \overset{\infty}{p}, \tag{108} \]

i.e.: The column of limiting absolute probabilities $\overset{\infty}{p}$ is a characteristic vector
of $P^{\mathsf T}$ for the characteristic value $\lambda = 1$.

If a fully regular Markov chain is given, then $\lambda = 1$ is a simple root of
the characteristic equation of $P^{\mathsf T}$. In this case, the column of limiting
absolute probabilities is uniquely determined by (108) (because $\overset{\infty}{p}_j \ge 0$
($j = 1, 2, \dots, n$) and $\sum_{j=1}^{n} \overset{\infty}{p}_j = 1$).
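The relations (106) to (108) lend themselves to a small numerical check. The sketch below is an illustration only, under stated assumptions: the matrix `P` is a hypothetical fully regular chain, the function names `step` and `limiting_distribution` are ours, and the limit is approximated by a fixed number of iterations of (106).

```python
def step(P, p):
    """One application of (106): returns the column P^T p."""
    n = len(P)
    return [sum(P[h][i] * p[h] for h in range(n)) for i in range(n)]


def limiting_distribution(P, p0, steps=200):
    """Approximates the limiting absolute probabilities by iterating (106)."""
    p = list(p0)
    for _ in range(steps):
        p = step(P, p)
    return p


# Hypothetical transition matrix of a fully regular (indeed acyclic) chain.
P = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]

p_from_first = limiting_distribution(P, [1.0, 0.0, 0.0])
p_from_last = limiting_distribution(P, [0.0, 0.0, 1.0])
print(p_from_first, p_from_last)
```

Both starting columns lead to the same limit, and applying `step` once more leaves that limit unchanged, which is the statement that it is a characteristic vector of $P^{\mathsf T}$ for $\lambda = 1$.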




Suppose that a fully regular Markov chain is given. Then it follows
from (104) and (107) that:

\[ \overset{\infty}{p}_j = \sum_{h=1}^{n} \overset{0}{p}_h\, p_{hj}^{\infty} = p_{*j}^{\infty} \sum_{h=1}^{n} \overset{0}{p}_h = p_{*j}^{\infty} \quad (j = 1, 2, \dots, n). \tag{109} \]

In this case the limiting absolute probabilities $\overset{\infty}{p}_1, \overset{\infty}{p}_2, \dots, \overset{\infty}{p}_n$ do not
depend on the initial probabilities $\overset{0}{p}_1, \overset{0}{p}_2, \dots, \overset{0}{p}_n$.

Conversely, $\overset{\infty}{p}$ is independent of $\overset{0}{p}$ on account of (107) if and only if all
the rows of $P^{\infty}$ are equal, i.e.,

\[ p_{hj}^{\infty} = p_{*j}^{\infty} \quad (h, j = 1, 2, \dots, n), \]

so that (by Theorem 11) $P$ is a fully regular matrix.
If $P$ is primitive, then $P^{\infty} > O$ and hence, by (109),

\[ \overset{\infty}{p}_j > 0 \quad (j = 1, 2, \dots, n). \]

Conversely, if all the $\overset{\infty}{p}_j$ ($j = 1, 2, \dots, n$) are positive and do not depend
on the initial probabilities, then all the elements in every column of $P^{\infty}$ are
equal and by (109) $P^{\infty} > O$, and this means by Theorem 11 that $P$ is primitive,
i.e., that the given chain is acyclic.

From these remarks it follows that Theorem 11 can also be formulated 
as follows: 


THEOREM 11': 1. In a homogeneous Markov chain all the limiting absolute
probabilities exist for arbitrary initial probabilities if and only if the
chain is regular.

2. In a homogeneous Markov chain the limiting absolute probabilities
exist for arbitrary initial probabilities and are independent of them if and
only if the chain is fully regular.

3. In a homogeneous Markov chain positive limiting absolute probabilities
exist for arbitrary initial probabilities and are independent of them if
and only if the chain is acyclic.⁴⁸


5. We now consider a homogeneous Markov chain of general type with a 
matrix P of transition probabilities. 


48 The second part of Theorem 11’ is sometimes called the ergodic theorem and the 
first part the general quasi-ergodic theorem for homogeneous Markov chains (see [4], 
pp. 473 and 476). 




We choose the normal form (69) for $P$ and denote by $h_1, h_2, \dots, h_g$ the
indices of imprimitivity of the matrices $A_1, A_2, \dots, A_g$ in (69). Let $h$ be
the least common multiple of the integers $h_1, h_2, \dots, h_g$. Then the matrix
$P^{h}$ has no characteristic values, other than 1, of modulus 1, i.e., $P^{h}$ is regular;
here $h$ is the least exponent for which $P^{h}$ is regular. We shall call $h$ the
period of the given homogeneous Markov chain.

Since $P^{h}$ is regular, the limit

\[ \lim_{q \to \infty} P^{hq} = (P^{h})^{\infty} \]

exists and hence the limits

\[ P_r = \lim_{q \to \infty} P^{r + qh} = P^{r} (P^{h})^{\infty} \quad (r = 0, 1, \dots, h - 1) \]

also exist.
Thus, in general, the sequence of matrices

\[ P,\ P^{2},\ P^{3},\ \dots \]

splits into $h$ subsequences with the limits $P_r = P^{r} (P^{h})^{\infty}$ ($r = 0, 1, \dots, h - 1$).
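When we go from the transition probabilities to the absolute probabilities
by means of (106), we find that the sequence

\[ \overset{1}{p},\ \overset{2}{p},\ \overset{3}{p},\ \dots \]

splits into $h$ subsequences with the limits

\[ \lim_{q \to \infty} \overset{r + qh}{p} = P_r^{\mathsf T}\, \overset{0}{p} \quad (r = 0, 1, 2, \dots, h - 1). \]

For an arbitrary homogeneous Markov chain with a finite number of
states the limits of the arithmetic means always exist:

\[ \tilde{P} = \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} P^{k} = \frac{1}{h} \bigl( E + P + \dots + P^{h-1} \bigr) (P^{h})^{\infty} \tag{110} \]

and

\[ \tilde{p} = \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} \overset{k}{p} = \tilde{P}^{\mathsf T}\, \overset{0}{p}. \tag{110'} \]

Here $\tilde{P} = \| \tilde{p}_{ij} \|_1^n$ and $\tilde{p} = (\tilde{p}_1, \tilde{p}_2, \dots, \tilde{p}_n)$. The values $\tilde{p}_{ij}$ ($i, j = 1, 2,
\dots, n$) and $\tilde{p}_j$ ($j = 1, 2, \dots, n$) are called the mean limiting transition
probabilities and mean limiting absolute probabilities, respectively.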




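Since

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} P^{k+1} = \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} P^{k}, \]

we have

\[ \tilde{P} P = \tilde{P} \]

and therefore, by (110'),

\[ P^{\mathsf T} \tilde{p} = \tilde{p}; \tag{111} \]

i.e., $\tilde{p}$ is a characteristic vector of $P^{\mathsf T}$ for $\lambda = 1$.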
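Note that by (69) and (110) we may represent $\tilde{P}$ in the form

\[
\tilde{P} = \left( \begin{array}{cccc|c}
\tilde{A}_1 & O & \cdots & O & O \\
O & \tilde{A}_2 & \cdots & O & O \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
O & O & \cdots & \tilde{A}_g & O \\ \hline
\multicolumn{4}{c|}{\tilde{U}} & \tilde{W}
\end{array} \right),
\]

where

\[
\tilde{A}_i = \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} A_i^{k} \quad (i = 1, 2, \dots, g), \qquad
\tilde{W} = \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} W^{k},
\]

\[
W = \begin{pmatrix}
A_{g+1} & O & \cdots & O \\
* & A_{g+2} & \cdots & O \\
\vdots & \vdots & \ddots & \vdots \\
* & * & \cdots & A_s
\end{pmatrix}.
\]

Since all the characteristic values of $W$ are of modulus less than 1, we
have

\[ \lim_{k \to \infty} W^{k} = O, \]

and therefore $\tilde{W} = O$.
Hence

\[
\tilde{P} = \left( \begin{array}{cccc|c}
\tilde{A}_1 & O & \cdots & O & O \\
O & \tilde{A}_2 & \cdots & O & O \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
O & O & \cdots & \tilde{A}_g & O \\ \hline
\multicolumn{4}{c|}{\tilde{U}} & O
\end{array} \right). \tag{112}
\]

Since $\tilde{P}$ is a stochastic matrix, the matrices $\tilde{A}_1, \tilde{A}_2, \dots, \tilde{A}_g$ are also
stochastic.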




From this representation of $\tilde{P}$ and from (110') it follows that: The mean
limiting absolute probabilities corresponding to non-essential states are
always zero.

If $g = 1$ in the normal form of $P$, then $\lambda = 1$ is a simple characteristic
value of $P^{\mathsf T}$.

In this case $\tilde{p}$ is uniquely determined by (111), and the mean limiting
probabilities $\tilde{p}_1, \tilde{p}_2, \dots, \tilde{p}_n$ do not depend on the initial probabilities
$\overset{0}{p}_1, \overset{0}{p}_2, \dots, \overset{0}{p}_n$. Conversely, if $\tilde{p}$ does not depend on $\overset{0}{p}$, then $\tilde{P}$ is of rank 1
by (110'). But the rank of (112) can be 1 only if $g = 1$.


We formulate these results in the following theorem:⁴⁹

THEOREM 12: For an arbitrary homogeneous Markov chain with period
$h$ the probability matrices $P^{k}$ and $\overset{k}{p}$ tend to a periodic repetition with period
$h$ for $k \to \infty$; moreover, the mean limiting transition probabilities and the
absolute probabilities $\tilde{P} = \| \tilde{p}_{ij} \|_1^n$ and $\tilde{p} = (\tilde{p}_1, \tilde{p}_2, \dots, \tilde{p}_n)$ defined by (110)
and (110') always exist.

The mean absolute probabilities corresponding to non-essential states are
always zero.

If $g = 1$ in the normal form of $P$ (and only in this case), the mean limiting
absolute probabilities $\tilde{p}_1, \tilde{p}_2, \dots, \tilde{p}_n$ are independent of the initial
probabilities $\overset{0}{p}_1, \overset{0}{p}_2, \dots, \overset{0}{p}_n$ and are uniquely determined by (111).
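A cyclic chain shows the behaviour described in Theorem 12 in its simplest form. The sketch below is an illustration under stated assumptions: the two-state matrix `P` is a hypothetical example with period $h = 2$, and the helper names `mat_mul` and `mean_of_powers` are ours. The powers of `P` alternate and have no limit, but the arithmetic means (110) converge.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mean_of_powers(P, N):
    """Arithmetic mean (1/N) * (P + P^2 + ... + P^N), as in (110)."""
    n = len(P)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [[0.0] * n for _ in range(n)]
    for _ in range(N):
        power = mat_mul(power, P)
        for i in range(n):
            for j in range(n):
                total[i][j] += power[i][j]
    return [[total[i][j] / N for j in range(n)] for i in range(n)]


# Hypothetical cyclic chain: the system changes state at every step.
P = [[0.0, 1.0],
     [1.0, 0.0]]

P_squared = mat_mul(P, P)         # equals E, so P^k alternates between P and E
P_mean = mean_of_powers(P, 1000)  # approximates the mean limiting matrix
print(P_squared, P_mean)
```

Here $P^{h} = E$, so the right-hand side of (110) reduces to $\tfrac{1}{2}(E + P)$, which agrees with the computed mean. Since $g = 1$ for this chain, the mean limiting absolute probabilities do not depend on the initial probabilities.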


§ 8. Totally Non-negative Matrices 


In this and the following sections we consider real matrices in which not
only the elements, but also all the minors of every order are non-negative.
Such matrices have important applications in the theory of small oscilla- 
tions of elastic systems. The reader will find a detailed study of these 
matrices and their applications in the book [17]. Here we shall only deal 
with some of their basic properties. 


1. We begin with a definition:

DEFINITION 5: A rectangular matrix

\[ A = \| a_{ik} \| \quad (i = 1, 2, \dots, m;\ k = 1, 2, \dots, n) \]

is called totally non-negative (totally positive) if all its minors of any order
are non-negative (positive):


49 This theorem is sometimes called the asymptotic theorem for homogeneous Markov 
chains. See [4], pp. 479-82. 




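\[
A \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ k_1 & k_2 & \cdots & k_p \end{pmatrix} \ge 0 \quad (> 0)
\qquad \left( \begin{matrix} 1 \le i_1 < i_2 < \dots < i_p \le m \\ 1 \le k_1 < k_2 < \dots < k_p \le n \end{matrix};\ p = 1, 2, \dots, \min(m, n) \right).
\]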


In what follows we shall only consider square totally non-negative and 
totally positive matrices. 


Example 1. The generalized Vandermonde matrix

\[ V = \| a_i^{\alpha_k} \|_1^n \quad (0 < a_1 < a_2 < \dots < a_n;\ \alpha_1 < \alpha_2 < \dots < \alpha_n) \]

is totally positive. Let us show first that $|V| \ne 0$. Indeed, from $|V| = 0$
it would follow that we could determine real numbers $c_1, c_2, \dots, c_n$, not all
equal to zero, such that the function

\[ f(x) = \sum_{k=1}^{n} c_k x^{\alpha_k} \quad (\alpha_i \ne \alpha_j \text{ for } i \ne j) \]

has the $n$ zeros $x_i = a_i$ ($i = 1, 2, \dots, n$), where $n$ is the number of terms in
the above sum. For $n = 1$ this is impossible. Let us make the induction
hypothesis that it is impossible for a sum of $n_1$ terms, where $n_1 < n$,
and show that it is then also impossible for the given function $f(x)$. Assume
the contrary. Then by Rolle's Theorem the function $f_1(x) = [x^{-\alpha_1} f(x)]'$
consisting of $n - 1$ terms would have $n - 1$ positive zeros, and this
contradicts the induction hypothesis.

Thus, $|V| \ne 0$. But for $\alpha_1 = 0, \alpha_2 = 1, \dots, \alpha_n = n - 1$ the determinant
$|V|$ goes over into the ordinary Vandermonde determinant $| a_i^{k-1} |_1^n$, which
is positive. Since the transition from this to the generalized Vandermonde
determinant can be carried out by means of a continuous change of the
exponents $\alpha_1, \alpha_2, \dots, \alpha_n$ with preservation of the inequalities $\alpha_1 < \alpha_2 < \dots
< \alpha_n$, and since, by what we have shown, the determinant does not vanish
in this process, we have $|V| > 0$ for arbitrary $0 < a_1 < a_2 < \dots < a_n$ and
$\alpha_1 < \alpha_2 < \dots < \alpha_n$.

Since every minor of $V$ can be regarded as the determinant of some
generalized Vandermonde matrix, all the minors of $V$ are positive.
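Total positivity of a small generalized Vandermonde matrix can be checked directly from Definition 5 by enumerating all minors. The sketch below is an illustration only: the bases and exponents are hypothetical values, the exponents are restricted to integers so that exact rational arithmetic can be used, and the helper names `det`, `minors`, and `is_totally_positive` are ours. The brute-force enumeration is practical only for small orders.

```python
from fractions import Fraction
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry != 0:
            sub = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(sub)
    return total


def minors(a):
    """Yields every minor of every order of the matrix a."""
    n_rows, n_cols = len(a), len(a[0])
    for p in range(1, min(n_rows, n_cols) + 1):
        for rows in combinations(range(n_rows), p):
            for cols in combinations(range(n_cols), p):
                yield det([[a[i][k] for k in cols] for i in rows])


def is_totally_positive(a):
    return all(value > 0 for value in minors(a))


# Hypothetical data: increasing positive bases and increasing integer exponents.
bases = [Fraction(1, 2), Fraction(1), Fraction(3, 2), Fraction(3)]
exponents = [-1, 0, 2, 5]
V = [[a ** e for e in exponents] for a in bases]

print(is_totally_positive(V))        # bases in increasing order
print(is_totally_positive(V[::-1]))  # rows reversed: the ordering is violated
```

Reversing the rows destroys the ordering $a_1 < a_2 < \dots < a_n$, and the check then fails, which shows that the ordering assumption in Example 1 cannot simply be dropped.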


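Example 2. We consider a Jacobi matrix

\[
J = \begin{pmatrix}
a_1 & b_1 & 0 & \cdots & 0 & 0 \\
c_1 & a_2 & b_2 & \cdots & 0 & 0 \\
0 & c_2 & a_3 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & a_{n-1} & b_{n-1} \\
0 & 0 & 0 & \cdots & c_{n-1} & a_n
\end{pmatrix} \tag{113}
\]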




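in which all the elements are zero outside the main diagonal and the first
super-diagonal and sub-diagonal. Let us set up a formula that expresses an
arbitrary minor of the matrix in terms of principal minors and the elements
$b$, $c$. Suppose that

\[ 1 \le i_1 < i_2 < \dots < i_p \le n, \qquad 1 \le k_1 < k_2 < \dots < k_p \le n \]

and

\[
i_1 = k_1,\ i_2 = k_2,\ \dots,\ i_{\nu_1} = k_{\nu_1};\quad
i_{\nu_1 + 1} \ne k_{\nu_1 + 1},\ \dots,\ i_{\nu_2} \ne k_{\nu_2};\quad
i_{\nu_2 + 1} = k_{\nu_2 + 1},\ \dots,\ i_{\nu_3} = k_{\nu_3};\ \dots;
\]

then

\[
J \begin{pmatrix} i_1 & \cdots & i_p \\ k_1 & \cdots & k_p \end{pmatrix}
= J \begin{pmatrix} i_1 & \cdots & i_{\nu_1} \\ k_1 & \cdots & k_{\nu_1} \end{pmatrix}
J \begin{pmatrix} i_{\nu_1 + 1} \\ k_{\nu_1 + 1} \end{pmatrix} \cdots
J \begin{pmatrix} i_{\nu_2} \\ k_{\nu_2} \end{pmatrix}
J \begin{pmatrix} i_{\nu_2 + 1} & \cdots & i_{\nu_3} \\ k_{\nu_2 + 1} & \cdots & k_{\nu_3} \end{pmatrix} \cdots. \tag{114}
\]

This formula is a consequence of the easily verifiable equation:

\[
J \begin{pmatrix} i_1 & \cdots & i_p \\ k_1 & \cdots & k_p \end{pmatrix}
= J \begin{pmatrix} i_1 & \cdots & i_{\nu} \\ k_1 & \cdots & k_{\nu} \end{pmatrix}
J \begin{pmatrix} i_{\nu + 1} & \cdots & i_p \\ k_{\nu + 1} & \cdots & k_p \end{pmatrix}
\quad (\text{for } i_{\nu} \ne k_{\nu}). \tag{115}
\]

From (114) it follows that every minor is the product of certain principal
minors and certain elements of $J$. Thus: For $J$ to be totally
non-negative it is necessary and sufficient that all the principal minors and the
elements $b$, $c$ should be non-negative.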
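The factorization (115) can be tested exhaustively on a small Jacobi matrix. The sketch below is an illustration only: the numerical entries are hypothetical, the helper names `det`, `minor`, `jacobi`, and `check_factorization` are ours, and indices are counted from 0 in the code (from 1 in the text).

```python
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry != 0:
            sub = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(sub)
    return total


def minor(a, rows, cols):
    return det([[a[i][k] for k in cols] for i in rows])


def jacobi(a, b, c):
    """Jacobi matrix with diagonal a, super-diagonal b and sub-diagonal c."""
    n = len(a)
    J = [[0] * n for _ in range(n)]
    for i in range(n):
        J[i][i] = a[i]
    for i in range(n - 1):
        J[i][i + 1] = b[i]
        J[i + 1][i] = c[i]
    return J


def check_factorization(J):
    """Tests (115) for every minor and every split with unequal indices."""
    n, checked, failures = len(J), 0, []
    for p in range(2, n + 1):
        for rows in combinations(range(n), p):
            for cols in combinations(range(n), p):
                for v in range(1, p):
                    if rows[v - 1] != cols[v - 1]:
                        checked += 1
                        left = minor(J, rows, cols)
                        right = minor(J, rows[:v], cols[:v]) * minor(J, rows[v:], cols[v:])
                        if left != right:
                            failures.append((rows, cols, v))
    return checked, failures


J = jacobi([3, 4, 5, 6], [1, 2, 1], [2, 1, 3])  # hypothetical entries
checked, failures = check_factorization(J)
print(checked, failures)
```

For this matrix all principal minors and all elements $b$, $c$ are positive, so by the criterion above every minor of `J` is non-negative.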


2. A totally non-negative matrix $A = \| a_{ik} \|_1^n$ always satisfies the following
important determinantal inequality:⁵⁰

\[
A \begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix}
\le A \begin{pmatrix} 1 & 2 & \cdots & p \\ 1 & 2 & \cdots & p \end{pmatrix}
A \begin{pmatrix} p + 1 & \cdots & n \\ p + 1 & \cdots & n \end{pmatrix} \quad (p < n). \tag{116}
\]

Before deriving this inequality, we prove the following lemma:

LEMMA 5: If in a totally non-negative matrix $A = \| a_{ik} \|_1^n$ any principal
minor vanishes, then every principal minor 'bordering' it also vanishes.

Proof. The lemma will be proved if we can show that for a totally
non-negative matrix $A = \| a_{ik} \|_1^n$ it follows from


50 See [172] and [17], pp. 111ff, where it is also shown that the equality sign in
(116) can only hold in the following obvious cases:

1) One of the factors on the right-hand side of (116) is zero;

2) All the elements $a_{ik}$ ($i = 1, 2, \dots, p$; $k = p + 1, \dots, n$) or $a_{ik}$ ($i = p + 1, \dots, n$;
$k = 1, 2, \dots, p$) are zero.

The inequality (116) has the same outward form as the generalized Hadamard inequality
(see (33), Vol. I, p. 255) for a positive-definite hermitian or quadratic form.




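\[ A \begin{pmatrix} 1 & 2 & \cdots & q \\ 1 & 2 & \cdots & q \end{pmatrix} = 0 \quad (q < n) \tag{117} \]

that

\[ A \begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix} = 0. \tag{118} \]

For this purpose we consider two cases:

1) $a_{11} = 0$. Since

\[
\begin{vmatrix} a_{11} & a_{1k} \\ a_{i1} & a_{ik} \end{vmatrix} = -a_{i1} a_{1k} \ge 0, \quad a_{i1} \ge 0, \quad a_{1k} \ge 0 \quad (i, k = 2, \dots, n),
\]

either all the $a_{i1} = 0$ ($i = 2, \dots, n$) or all the $a_{1k} = 0$ ($k = 2, \dots, n$).
These equations and $a_{11} = 0$ imply (118).

2) $a_{11} \ne 0$. Then for some $p$ ($1 < p \le q$)

\[
A \begin{pmatrix} 1 & 2 & \cdots & p - 1 \\ 1 & 2 & \cdots & p - 1 \end{pmatrix} \ne 0, \qquad
A \begin{pmatrix} 1 & 2 & \cdots & p - 1 & p \\ 1 & 2 & \cdots & p - 1 & p \end{pmatrix} = 0. \tag{119}
\]

We introduce bordered determinants

\[
d_{ik} = A \begin{pmatrix} 1 & 2 & \cdots & p - 1 & i \\ 1 & 2 & \cdots & p - 1 & k \end{pmatrix} \quad (i, k = p, p + 1, \dots, n) \tag{120}
\]

and form from them a matrix $D = \| d_{ik} \|_p^n$.
By Sylvester's identity (Vol. I, Chapter II, § 3),

\[
D \begin{pmatrix} i_1 & i_2 & \cdots & i_g \\ k_1 & k_2 & \cdots & k_g \end{pmatrix}
= \left[ A \begin{pmatrix} 1 & 2 & \cdots & p - 1 \\ 1 & 2 & \cdots & p - 1 \end{pmatrix} \right]^{g-1}
A \begin{pmatrix} 1 & 2 & \cdots & p - 1 & i_1 & i_2 & \cdots & i_g \\ 1 & 2 & \cdots & p - 1 & k_1 & k_2 & \cdots & k_g \end{pmatrix} \ge 0 \tag{121}
\]

\[
\left( \begin{matrix} p \le i_1 < i_2 < \dots < i_g \le n \\ p \le k_1 < k_2 < \dots < k_g \le n \end{matrix};\ g = 1, 2, \dots, n - p + 1 \right),
\]

so that $D$ is a totally non-negative matrix.
Since by (119)

\[ d_{pp} = A \begin{pmatrix} 1 & 2 & \cdots & p \\ 1 & 2 & \cdots & p \end{pmatrix} = 0, \]

the matrix $D$ falls under the case 1) and

\[
D \begin{pmatrix} p & p + 1 & \cdots & n \\ p & p + 1 & \cdots & n \end{pmatrix}
= \left[ A \begin{pmatrix} 1 & 2 & \cdots & p - 1 \\ 1 & 2 & \cdots & p - 1 \end{pmatrix} \right]^{n-p}
A \begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix} = 0.
\]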


Since $A \begin{pmatrix} 1 & 2 & \cdots & p - 1 \\ 1 & 2 & \cdots & p - 1 \end{pmatrix} \ne 0$, (118) follows, and the lemma is proved.


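3. We may now assume in the derivation of the inequality (116) that all
the principal minors of $A$ are different from zero, since by Lemma 5 one of
the principal minors can only be zero when $|A| = 0$, and in this case the
inequality (116) is obvious.

For $n = 2$, (116) can be verified immediately:

\[ A \begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix} = a_{11} a_{22} - a_{12} a_{21} \le a_{11} a_{22}, \]

since $a_{12} \ge 0$, $a_{21} \ge 0$. We shall establish (116) for $n > 2$ under the
assumption that it is true for matrices of order less than $n$. Moreover, without loss
of generality, we may assume that $p > 1$, since otherwise by reversing the
numbering of the rows and columns we could interchange the roles of $p$
and $n - p$.

We now consider again the matrix $D = \| d_{ik} \|_p^n$, where the $d_{ik}$ ($i, k = p,
p + 1, \dots, n$) are defined by (120); we use Sylvester's identity twice as well
as the basic inequality (116) for matrices of order less than $n$ and obtain:

\[
\begin{aligned}
A \begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix}
&= \frac{D \begin{pmatrix} p & p + 1 & \cdots & n \\ p & p + 1 & \cdots & n \end{pmatrix}}
        {\left[ A \begin{pmatrix} 1 & 2 & \cdots & p - 1 \\ 1 & 2 & \cdots & p - 1 \end{pmatrix} \right]^{n-p}}
 \le \frac{d_{pp}\, D \begin{pmatrix} p + 1 & \cdots & n \\ p + 1 & \cdots & n \end{pmatrix}}
        {\left[ A \begin{pmatrix} 1 & 2 & \cdots & p - 1 \\ 1 & 2 & \cdots & p - 1 \end{pmatrix} \right]^{n-p}} \\
&= \frac{A \begin{pmatrix} 1 & 2 & \cdots & p \\ 1 & 2 & \cdots & p \end{pmatrix}
         A \begin{pmatrix} 1 & 2 & \cdots & p - 1 & p + 1 & \cdots & n \\ 1 & 2 & \cdots & p - 1 & p + 1 & \cdots & n \end{pmatrix}}
        {A \begin{pmatrix} 1 & 2 & \cdots & p - 1 \\ 1 & 2 & \cdots & p - 1 \end{pmatrix}} \\
&\le A \begin{pmatrix} 1 & 2 & \cdots & p \\ 1 & 2 & \cdots & p \end{pmatrix}
     A \begin{pmatrix} p + 1 & \cdots & n \\ p + 1 & \cdots & n \end{pmatrix}.
\end{aligned} \tag{122}
\]

Thus, the inequality (116) has been established.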
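As a plausibility check of (116), the inequality can be evaluated for a small totally positive matrix. The sketch below is an illustration only: the matrix `V` is the ordinary Vandermonde matrix on the hypothetical nodes 1, 2, 3, 4 (totally positive by Example 1 of this section), and the helper names `det` and `principal_minor` are ours.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry != 0:
            sub = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(sub)
    return total


def principal_minor(a, indices):
    """Principal minor of a on the given (0-based) row and column indices."""
    idx = list(indices)
    return det([[a[i][k] for k in idx] for i in idx])


# Ordinary Vandermonde matrix with rows (1, a, a^2, a^3) for a = 1, 2, 3, 4.
V = [[a ** k for k in range(4)] for a in (1, 2, 3, 4)]
n = len(V)

full = principal_minor(V, range(n))
# Right-hand sides of (116) for p = 1, ..., n - 1.
bounds = [principal_minor(V, range(p)) * principal_minor(V, range(p, n))
          for p in range(1, n)]
print(full, bounds)
```

In each case the full determinant does not exceed the product of the two complementary principal minors, in agreement with (116).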
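Let us make the following definition:

DEFINITION 6. A minor

\[
A \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ k_1 & k_2 & \cdots & k_p \end{pmatrix}
\qquad \left( \begin{matrix} 1 \le i_1 < i_2 < \dots < i_p \le n \\ 1 \le k_1 < k_2 < \dots < k_p \le n \end{matrix} \right) \tag{123}
\]

of the matrix $A = \| a_{ik} \|_1^n$ will be called almost principal if of the differences
$i_1 - k_1, i_2 - k_2, \dots, i_p - k_p$ only one is not zero.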




We can then point out that the whole derivation of (116) (and the proof
of the auxiliary lemma) remains valid if the condition 'A is totally
non-negative' is replaced by the weaker condition 'all the principal and almost
principal minors of A are non-negative.'⁵¹


§ 9. Oscillatory Matrices 


1. The characteristic values and characteristic vectors of totally positive
matrices have a number of remarkable properties. However, the class of
totally positive matrices is not wide enough from the point of view of
applications to small oscillations of elastic systems. In this respect, the class of
totally non-negative matrices is sufficiently extensive. But the spectral
properties we need do not hold for all totally non-negative matrices. Now
there exists an intermediate class (between that of totally positive and that
of totally non-negative matrices) in which the spectral properties of totally
positive matrices are preserved and which is of sufficiently wide scope for
the applications. The matrices of this intermediate class have been called
'oscillatory.' The name is due to the fact that oscillatory matrices form the
mathematical apparatus for the study of oscillatory properties of small
vibrations of elastic systems.⁵²


DEFINITION 7. A matrix $A = \| a_{ik} \|_1^n$ is called oscillatory if $A$ is totally
non-negative and if there exists an integer $q > 0$ such that $A^{q}$ is totally
positive.

Example. A Jacobi matrix $J$ (see (113)) is oscillatory if and only if
1. all the numbers $b$, $c$ are positive and 2. the successive principal minors are
positive:


51 See [214]. We take this opportunity of mentioning that in the second edition of the
book [17] by F. R. Gantmacher and M. G. Krein a mistake crept in which was first
pointed out to the authors by D. M. Kotelyanskii. On p. 111 of that book an almost
principal minor (123) was defined by the equation

\[ \sum_{\nu=1}^{p} | i_{\nu} - k_{\nu} | = 1. \]

With this definition, the inequality (116) does not follow from the fact that the principal
and the almost principal minors are non-negative. However, all the statements and proofs
of § 6, Chapter II in [17] that refer to the fundamental inequality remain valid if an
almost principal minor is defined as above and as we have done in the paper [214].


52 See [17], Introduction, Chapter III, and Chapter IV. 


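\[
a_1 > 0, \quad
\begin{vmatrix} a_1 & b_1 \\ c_1 & a_2 \end{vmatrix} > 0, \quad
\begin{vmatrix} a_1 & b_1 & 0 \\ c_1 & a_2 & b_2 \\ 0 & c_2 & a_3 \end{vmatrix} > 0, \ \dots, \
\begin{vmatrix}
a_1 & b_1 & 0 & \cdots & 0 & 0 \\
c_1 & a_2 & b_2 & \cdots & 0 & 0 \\
0 & c_2 & a_3 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & c_{n-1} & a_n
\end{vmatrix} > 0. \tag{124}
\]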


Necessity of 1., 2. The numbers $b$, $c$ are non-negative, because $J \ge O$.
But none of the numbers $b$, $c$ may be zero, since otherwise the matrix would
be reducible and then the inequality $J^{q} > O$ could not hold for any $q > 0$.
Hence, all the numbers $b$, $c$ are positive. All the principal minors of (124)
are positive, by Lemma 5, since it follows from $|J| \ge 0$ and $|J^{q}| > 0$ that
$|J| > 0$.

Sufficiency of 1., 2. When we expand $|J|$ we easily see that the numbers
$b$, $c$ occur in $|J|$ only as products $b_1 c_1, b_2 c_2, \dots, b_{n-1} c_{n-1}$. The same
applies to every principal minor of 'zero density,' i.e., a minor formed from
successive rows and columns (without gaps). But every principal minor of
$J$ is a product of principal minors of zero density. Therefore: In every
principal minor of $J$ the numbers $b$ and $c$ occur only as products $b_1 c_1, b_2 c_2, \dots,
b_{n-1} c_{n-1}$.

We now form the symmetrical Jacobi matrix

\[
\tilde{J} = \begin{pmatrix}
a_1 & \tilde{b}_1 & & & 0 \\
\tilde{b}_1 & a_2 & \tilde{b}_2 & & \\
& \tilde{b}_2 & a_3 & \ddots & \\
& & \ddots & \ddots & \tilde{b}_{n-1} \\
0 & & & \tilde{b}_{n-1} & a_n
\end{pmatrix}, \qquad \tilde{b}_i = \sqrt{b_i c_i} > 0 \quad (i = 1, 2, \dots, n - 1). \tag{125}
\]

From the above properties of the principal minors of a Jacobi matrix it
follows that the corresponding principal minors of $J$ and $\tilde{J}$ are equal. But
then (124) means that the quadratic form

\[ \tilde{J}(x, x) \]

is positive definite (see Vol. I, Chapter X, Theorem 3, p. 306). But in a
positive-definite quadratic form all the principal minors are positive.
Therefore in $J$ too all the principal minors are positive. Since by 1. all the numbers
$b$, $c$ are positive, by (114) all the minors of $J$ are non-negative; i.e., $J$ is
totally non-negative.

That a totally non-negative matrix $J$ for which 1. and 2. are satisfied is
oscillatory follows immediately from the following criterion for an
oscillatory matrix.




A totally non-negative matrix $A = \| a_{ik} \|_1^n$ is oscillatory if and only if:

1) $A$ is non-singular ($|A| > 0$);

2) All the elements of $A$ in the principal diagonal and the first
super-diagonals and sub-diagonals are different from zero ($a_{ik} > 0$ for $|i - k| \le 1$).

The reader can find a proof of this proposition in [17], Chapter II, § 7.
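Both Definition 7 and the criterion just stated can be tested by brute force on a small matrix. The sketch below is an illustration only: the Jacobi matrix `J` is a hypothetical example, the helper names are ours, and the enumeration of all minors is practical only for small orders.

```python
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry != 0:
            sub = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(sub)
    return total


def minors(a):
    n = len(a)
    for p in range(1, n + 1):
        for rows in combinations(range(n), p):
            for cols in combinations(range(n), p):
                yield det([[a[i][k] for k in cols] for i in rows])


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def satisfies_criterion(a):
    """Totally non-negative, non-singular, positive elements for |i - k| <= 1."""
    n = len(a)
    near_diagonal = all(a[i][k] > 0 for i in range(n) for k in range(n)
                        if abs(i - k) <= 1)
    return all(v >= 0 for v in minors(a)) and det(a) > 0 and near_diagonal


def smallest_totally_positive_power(a, max_q=10):
    """Least q <= max_q with A^q totally positive, or None if there is none."""
    power = a
    for q in range(1, max_q + 1):
        if all(v > 0 for v in minors(power)):
            return q
        power = mat_mul(power, a)
    return None


J = [[2, 1, 0],
     [1, 2, 1],
     [0, 1, 2]]
E = [[1, 0], [0, 1]]  # totally non-negative but not oscillatory

print(satisfies_criterion(J), smallest_totally_positive_power(J))
print(satisfies_criterion(E), smallest_totally_positive_power(E))
```

The matrix `J` itself contains zeros and so is not totally positive, but its square already is. The unit matrix fails condition 2) and no power of it is totally positive.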


2. In order to formulate properties of the characteristic values and charac- 
teristic vectors of oscillatory matrices, we introduce some preliminary con- 
cepts and notations. 

We consider a vector (column)

\[ u = (u_1, u_2, \dots, u_n). \]

Let us count the number of variations of sign in the sequence of coordinates
$u_1, u_2, \dots, u_n$ of $u$, attributing arbitrary signs to the zero coordinates (if
any such exist). Depending on what signs we give to the zero coordinates
the number of variations of sign will vary within certain limits. The
maximal and minimal number of variations of sign so obtained will be
denoted by $S_u^{+}$ and $S_u^{-}$, respectively. If $S_u^{-} = S_u^{+}$, we shall speak of the exact
number of sign changes and denote it by $S_u$. Obviously $S_u^{-} = S_u^{+}$ if and only
if 1. the extreme coordinates $u_1$ and $u_n$ of $u$ are different from zero, and
2. $u_i = 0$ ($1 < i < n$) always implies that $u_{i-1} u_{i+1} < 0$.
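The minimal and maximal numbers of sign variations can be computed directly from this definition by trying every assignment of signs to the zero coordinates. The sketch below is an illustration only: the function names `sign_changes` and `sign_variation_bounds` are ours, and the brute-force search is exponential in the number of zero coordinates, so it is meant for short vectors.

```python
from itertools import product


def sign_changes(seq):
    """Number of sign changes in a sequence without zero terms."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a * b < 0)


def sign_variation_bounds(u):
    """Returns the pair (S_minus, S_plus) for the vector u."""
    zero_positions = [i for i, x in enumerate(u) if x == 0]
    counts = []
    for signs in product((1, -1), repeat=len(zero_positions)):
        filled = list(u)
        for position, sign in zip(zero_positions, signs):
            filled[position] = sign
        counts.append(sign_changes(filled))
    return min(counts), max(counts)


print(sign_variation_bounds([1, 0, -1]))  # zero between coordinates of opposite sign
print(sign_variation_bounds([1, 0, 1]))   # zero between coordinates of like sign
print(sign_variation_bounds([0, 1, 2]))   # zero extreme coordinate
```

Only in the first case do the two bounds coincide, in agreement with conditions 1. and 2. above.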
We shall now prove the following fundamental theorem:

THEOREM 13: 1. An oscillatory matrix $A = \| a_{ik} \|_1^n$ always has $n$
distinct positive characteristic values

\[ \lambda_1 > \lambda_2 > \dots > \lambda_n > 0. \tag{126} \]

2. The characteristic vector $\overset{1}{u} = (u_{11}, u_{21}, \dots, u_{n1})$ of $A$ that belongs
to the largest characteristic value $\lambda_1$ has only non-zero coordinates of like
sign; the characteristic vector $\overset{2}{u} = (u_{12}, u_{22}, \dots, u_{n2})$ that belongs to the
second largest characteristic value $\lambda_2$ has exactly one variation of sign in its
coordinates; more generally, the characteristic vector $\overset{k}{u} = (u_{1k}, u_{2k}, \dots, u_{nk})$
that belongs to the characteristic value $\lambda_k$ has exactly $k - 1$ variations of sign
($k = 1, 2, \dots, n$).

3. For arbitrary real numbers $c_g, c_{g+1}, \dots, c_h$ ($1 \le g \le h \le n$;
$\sum_{k=g}^{h} c_k^2 > 0$) the number of variations of sign in the coordinates of the vector

\[ u = \sum_{k=g}^{h} c_k \overset{k}{u} \tag{127} \]

lies between $g - 1$ and $h - 1$:




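\[ g - 1 \le S_u^{-} \le S_u^{+} \le h - 1. \tag{128} \]

Proof. 1. We number the characteristic values $\lambda_1, \lambda_2, \dots, \lambda_n$ of $A$ so
that

\[ |\lambda_1| \ge |\lambda_2| \ge \dots \ge |\lambda_n| \]

and consider the $p$-th compound matrix $\mathfrak{A}_p$ ($p = 1, 2, \dots, n$) (see Chapter I,
§ 4). The characteristic values of $\mathfrak{A}_p$ are all the possible products of $p$
characteristic values of $A$ (see Vol. I, p. 75), i.e., the products

\[ \lambda_1 \lambda_2 \cdots \lambda_p, \quad \lambda_1 \lambda_2 \cdots \lambda_{p-1} \lambda_{p+1}, \quad \dots \]

From the conditions of the theorem it follows that for some integer $q$ $A^{q}$
is totally positive. But then $\mathfrak{A}_p \ge O$, $\mathfrak{A}_p^{q} > O$;⁵³ i.e., $\mathfrak{A}_p$ is irreducible,
non-negative, and primitive. Applying Frobenius' theorem (see § 2, p. 40)
to the primitive matrix $\mathfrak{A}_p$ ($p = 1, 2, \dots, n$), we obtain

\[
\begin{aligned}
\lambda_1 \lambda_2 \cdots \lambda_p &> 0 && (p = 1, 2, \dots, n), \\
\lambda_1 \lambda_2 \cdots \lambda_p &> \lambda_1 \lambda_2 \cdots \lambda_{p-1} \lambda_{p+1} && (p = 1, 2, \dots, n - 1).
\end{aligned}
\]

Hence (126) follows.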

2. From this inequality (126) it follows that $A = \| a_{ik} \|_1^n$ is a matrix
of simple structure. Then all the compound matrices $\mathfrak{A}_p$ ($p = 1, 2, \dots, n$)
are also of simple structure (see Vol. I, p. 74).

We consider the fundamental matrix $U = \| u_{ik} \|_1^n$ of $A$ (the $k$-th column
of $U$ contains the coordinates of the $k$-th characteristic vector $\overset{k}{u}$ of $A$; $k =
1, 2, \dots, n$). Then (see Vol. I, Chapter III, p. 74), the characteristic vector
of $\mathfrak{A}_p$ belonging to the characteristic value $\lambda_1 \lambda_2 \cdots \lambda_p$ has the coordinates

\[
U \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ 1 & 2 & \cdots & p \end{pmatrix} \quad (1 \le i_1 < i_2 < \dots < i_p \le n). \tag{129}
\]

By Frobenius' theorem all the numbers (129) are different from zero
and are of like sign. Multiplying the vectors $\overset{1}{u}, \overset{2}{u}, \dots, \overset{n}{u}$ by $\pm 1$, we can
make all the minors of (129) positive:

\[
U \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ 1 & 2 & \cdots & p \end{pmatrix} > 0
\qquad \left( \begin{matrix} 1 \le i_1 < i_2 < \dots < i_p \le n \\ p = 1, 2, \dots, n \end{matrix} \right). \tag{130}
\]
53 The matrix $\mathfrak{A}_p^{q}$ is the $p$-th compound matrix of $A^{q}$ (see Vol. I, Chapter I, p. 20).


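The fundamental matrix $U = \| u_{ik} \|_1^n$ is connected with $A$ by the equation

\[ A = U \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)\, U^{-1}. \tag{131} \]

But then

\[ A^{\mathsf T} = (U^{\mathsf T})^{-1} \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)\, U^{\mathsf T}. \tag{132} \]

Comparing (131) with (132), we see that

\[ V = (U^{\mathsf T})^{-1} \tag{133} \]

is the fundamental matrix of $A^{\mathsf T}$ with the same characteristic values $\lambda_1, \lambda_2,
\dots, \lambda_n$. But since $A$ is oscillatory, so is $A^{\mathsf T}$. Therefore in $V$ as well for
every $p = 1, 2, \dots, n$ all the minors

\[
V \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ 1 & 2 & \cdots & p \end{pmatrix} \quad (1 \le i_1 < i_2 < \dots < i_p \le n) \tag{134}
\]

are different from zero and are of the same sign.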
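On the other hand, by (133) $U$ and $V$ are connected by the equation

\[ U^{\mathsf T} V = E. \]

Going over to the $p$-th compound matrices (see Vol. I, Chapter I, § 4), we
have:

\[ \mathfrak{U}_p^{\mathsf T} \mathfrak{V}_p = \mathfrak{E}_p. \]

Hence, in particular, noting that the diagonal elements of $\mathfrak{E}_p$ are 1, we obtain:

\[
\sum_{1 \le i_1 < i_2 < \dots < i_p \le n}
U \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ 1 & 2 & \cdots & p \end{pmatrix}
V \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ 1 & 2 & \cdots & p \end{pmatrix} = 1. \tag{135}
\]

On the left-hand side of this equation, the first factor in each of the
summands is positive and the second factors are different from zero and are of
like sign. It is then obvious that the second factors as well are positive; i.e.,

\[
V \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ 1 & 2 & \cdots & p \end{pmatrix} > 0
\qquad \left( \begin{matrix} 1 \le i_1 < i_2 < \dots < i_p \le n \\ p = 1, 2, \dots, n \end{matrix} \right). \tag{136}
\]

Thus, the inequalities (130) and (136) hold for $U = \| u_{ik} \|_1^n$ and
$V = (U^{\mathsf T})^{-1}$ simultaneously.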




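When we express the minors of $V$ in terms of those of the inverse matrix
$V^{-1} = U^{\mathsf T}$ by the well-known formulas (see Vol. I, pp. 21-22), we obtain

\[
V \begin{pmatrix} j_1 & j_2 & \cdots & j_{n-p} \\ 1 & 2 & \cdots & n - p \end{pmatrix}
= \frac{(-1)^{np + \sum_{\nu=1}^{p} i_{\nu}}}{|U|}\,
U \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ n & n - 1 & \cdots & n - p + 1 \end{pmatrix}, \tag{137}
\]

where $i_1 < i_2 < \dots < i_p$ and $j_1 < j_2 < \dots < j_{n-p}$ together give the
complete system of indices $1, 2, \dots, n$. Since, by (130), $|U| > 0$ it follows from
(136) and (137) that

\[
(-1)^{np + \sum_{\nu=1}^{p} i_{\nu}}\,
U \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ n & n - 1 & \cdots & n - p + 1 \end{pmatrix} > 0
\qquad \left( \begin{matrix} 1 \le i_1 < i_2 < \dots < i_p \le n \\ p = 1, 2, \dots, n \end{matrix} \right). \tag{138}
\]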


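Now let $u = \sum_{k=g}^{h} c_k \overset{k}{u}$ ($\sum_{k=g}^{h} c_k^2 > 0$). We shall show that the inequalities
(130) imply the second part of (128):

\[ S_u^{+} \le h - 1, \tag{139} \]

and the inequalities (138), the first part:

\[ S_u^{-} \ge g - 1. \tag{140} \]

Suppose that $S_u^{+} > h - 1$. Then we can find $h + 1$ coordinates of $u$

\[ u_{i_1}, u_{i_2}, \dots, u_{i_{h+1}} \quad (1 \le i_1 < i_2 < \dots < i_{h+1} \le n) \tag{141} \]

such that

\[ u_{i_{\alpha}} u_{i_{\alpha + 1}} \le 0 \quad (\alpha = 1, 2, \dots, h). \]

Furthermore, the coordinates (141) cannot all be zero; for then we could
equate the corresponding coordinates of the vector $u = \sum_{k=1}^{h} c_k \overset{k}{u}$ ($c_1 = \dots =
c_{g-1} = 0$; $\sum_{k=1}^{h} c_k^2 > 0$) to zero and thus obtain a system of homogeneous
equations

\[ \sum_{k=1}^{h} c_k u_{i_{\alpha} k} = 0 \quad (\alpha = 1, 2, \dots, h) \]

with the non-zero solution $c_1, c_2, \dots, c_h$, whereas the determinant of the
system

\[ U \begin{pmatrix} i_1 & i_2 & \cdots & i_h \\ 1 & 2 & \cdots & h \end{pmatrix} \]

is different from zero, by (130).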




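We now consider the vanishing determinant

\[
\begin{vmatrix}
u_{i_1 1} & \cdots & u_{i_1 h} & u_{i_1} \\
\vdots & & \vdots & \vdots \\
u_{i_{h+1} 1} & \cdots & u_{i_{h+1} h} & u_{i_{h+1}}
\end{vmatrix} = 0.
\]

We expand it with respect to the elements of the last column:

\[
\sum_{\alpha=1}^{h+1} (-1)^{h + \alpha + 1}\, u_{i_{\alpha}}\,
U \begin{pmatrix} i_1 \ \dots \ i_{\alpha - 1} \ i_{\alpha + 1} \ \dots \ i_{h+1} \\ 1 \quad 2 \quad \dots \quad h \end{pmatrix} = 0.
\]

But such an equation cannot hold, since on the left-hand side all the terms
are of like sign and at least one term is different from zero. Hence the
assumption that $S_u^{+} > h - 1$ has led to a contradiction, and (139) can be
regarded as proved.

We consider the vector

\[ \overset{k}{u}{}^{*} = (u_{1k}^{*}, u_{2k}^{*}, \dots, u_{nk}^{*}) \quad (k = 1, 2, \dots, n), \]

where

\[ u_{ik}^{*} = (-1)^{n + i} u_{ik} \quad (i, k = 1, 2, \dots, n); \]

then for the matrix $U^{*} = \| u_{ik}^{*} \|_1^n$ we have, by (138):

\[
U^{*} \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ n & n - 1 & \cdots & n - p + 1 \end{pmatrix} > 0
\qquad \left( \begin{matrix} 1 \le i_1 < i_2 < \dots < i_p \le n \\ p = 1, 2, \dots, n \end{matrix} \right). \tag{142}
\]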

But the inequalities (142) are analogous to (130). Therefore, by setting

\[ u^{*} = \sum_{k=g}^{h} (-1)^{n} c_k\, \overset{k}{u}{}^{*} \tag{143} \]

we have the inequality analogous to (139):⁵⁴

\[ S_{u^{*}}^{+} \le n - g. \tag{144} \]

Let $u = (u_1, u_2, \dots, u_n)$ and $u^{*} = (u_1^{*}, u_2^{*}, \dots, u_n^{*})$. It is easy to see that

\[ u_i^{*} = (-1)^{i} u_i \quad (i = 1, 2, \dots, n). \]

Therefore


54 In the inequalities (142), the vectors $\overset{k}{u}{}^{*}$ ($k = 1, 2, \dots, n$) occur in the inverse order
$\overset{n}{u}{}^{*}, \overset{n-1}{u}{}^{*}, \dots$ The vector $\overset{g}{u}{}^{*}$ is preceded by $n - g$ vectors of this kind.




\[ S_{u^{*}}^{+} + S_u^{-} = n - 1, \]

and so the relation (140) holds, by (144).

This establishes the inequality (128). Since the second statement of the
theorem is obtained from (128) by setting $g = h = k$, the theorem is now
completely proved.
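Theorem 13 can be checked by hand on a small oscillatory matrix whose characteristic values are known in closed form. The sketch below is an illustration only: the matrix `A` is a hypothetical Jacobi matrix satisfying the conditions of the example after Definition 7, its characteristic pairs are written down explicitly rather than computed, and the helper names are ours.

```python
import math


def mat_vec(a, u):
    """Product of a square matrix and a column vector."""
    return [sum(a[i][k] * u[k] for k in range(len(u))) for i in range(len(a))]


def sign_changes_ignoring_zeros(u):
    """Minimal number of sign variations: zero coordinates are dropped."""
    nonzero = [x for x in u if x != 0]
    return sum(1 for x, y in zip(nonzero, nonzero[1:]) if x * y < 0)


A = [[2, 1, 0],
     [1, 2, 1],
     [0, 1, 2]]

r = math.sqrt(2)
# Characteristic values in decreasing order with their characteristic vectors.
pairs = [(2 + r, [1, r, 1]),
         (2.0, [1, 0, -1]),
         (2 - r, [1, -r, 1])]

# Largest deviation of A u from lambda u over all three pairs.
residual = max(abs(x - lam * y)
               for lam, u in pairs
               for x, y in zip(mat_vec(A, u), u))
variations = [sign_changes_ignoring_zeros(u) for _, u in pairs]
print(residual, variations)
```

The three characteristic values are distinct and positive, and the numbers of sign variations of the corresponding vectors are 0, 1, 2. For the second vector the zero coordinate lies between coordinates of opposite sign, so the count is exact in the sense defined above.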


3. As an application of this theorem, let us study the small oscillations of
$n$ masses $m_1, m_2, \dots, m_n$ concentrated at $n$ movable points $x_1 < x_2 < \dots < x_n$
of a segmentary elastic continuum (a string or a rod of finite length),
stretched (in a state of equilibrium) along the segment $0 \le x \le l$ of the
$x$-axis.

We denote by $K(x, s)$ ($0 \le x, s \le l$) the function of influence of this
continuum ($K(x, s)$ is the displacement at the point $x$ under the action of a
unit force applied at the point $s$) and by $k_{ij}$ the coefficients of influence for
the given masses:

\[ k_{ij} = K(x_i, x_j) \quad (i, j = 1, 2, \dots, n). \]

If at the points $x_1, x_2, \dots, x_n$ $n$ forces $F_1, F_2, \dots, F_n$ are applied, then
the corresponding static displacement $y(x)$ ($0 \le x \le l$) is given, by virtue
of the linear superposition of displacements, by the formula

\[ y(x) = \sum_{j=1}^{n} K(x, x_j) F_j. \]

When we here replace the forces $F_j$ by the inertial forces $-m_j \dfrac{\partial^2}{\partial t^2} y(x_j, t)$
($j = 1, 2, \dots, n$), we obtain the equation of free oscillations

\[ y(x) = -\sum_{j=1}^{n} m_j K(x, x_j) \frac{\partial^2}{\partial t^2} y(x_j, t). \tag{145} \]

We shall seek harmonic oscillations of the continuum in the form

\[ y(x) = u(x) \sin(\omega t + \alpha) \quad (0 \le x \le l). \tag{146} \]

Here $u(x)$ is the amplitude function, $\omega$ the frequency, and $\alpha$ the initial
phase. Substituting this expression for $y(x)$ in (145) and cancelling
$\sin(\omega t + \alpha)$, we obtain

\[ u(x) = \omega^2 \sum_{j=1}^{n} m_j K(x, x_j) u(x_j). \tag{147} \]




Let us introduce a notation for the variable displacements and the
displacements in amplitude at the points of distribution of mass:

\[ y_i = y(x_i, t), \quad u_i = u(x_i) \quad (i = 1, 2, \dots, n). \]

Then

\[ y_i = u_i \sin(\omega t + \alpha) \quad (i = 1, 2, \dots, n). \]

We also introduce the reduced amplitude displacements and the reduced
coefficients of influence

\[ \tilde{u}_i = \sqrt{m_i}\, u_i, \quad a_{ij} = \sqrt{m_i m_j}\, k_{ij} \quad (i, j = 1, 2, \dots, n). \tag{148} \]

Replacing $x$ in (147) by $x_i$ ($i = 1, 2, \dots, n$) successively, we obtain a
system of equations for the amplitude displacements:

\[ \sum_{j=1}^{n} a_{ij} \tilde{u}_j = \lambda \tilde{u}_i \quad \Bigl( \lambda = \frac{1}{\omega^2};\ i = 1, 2, \dots, n \Bigr). \tag{149} \]

Hence it is clear that the amplitude vector $\tilde{u} = (\tilde{u}_1, \tilde{u}_2, \dots, \tilde{u}_n)$ is a
characteristic vector of $A = \| a_{ij} \|_1^n = \| \sqrt{m_i m_j}\, k_{ij} \|_1^n$ for $\lambda = 1/\omega^2$ (see Vol. I, Chapter
X, § 8).

It can be established, as the result of a detailed analysis,⁵⁵ that the matrix
of the coefficients of influence $\| k_{ij} \|_1^n$ of a segmentary continuum is always
oscillatory. But then the matrix $A = \| a_{ij} \|_1^n = \| \sqrt{m_i m_j}\, k_{ij} \|_1^n$ is also
oscillatory! Therefore (by Theorem 13) $A$ has $n$ positive characteristic values

\[ \lambda_1 > \lambda_2 > \dots > \lambda_n > 0; \]

i.e., there exist $n$ harmonic oscillations of the continuum with distinct
frequencies:

\[ (0 <)\ \omega_1 < \omega_2 < \dots < \omega_n \quad \Bigl( \lambda_i = \frac{1}{\omega_i^2};\ i = 1, 2, \dots, n \Bigr). \]

By the same theorem to the fundamental frequency $\omega_1$ there correspond
amplitude displacements different from zero and of like sign. Among the
displacements in amplitude corresponding to the first overtone with the
frequency $\omega_2$, there is exactly one variation of sign and, in general, among
the displacements in amplitude for the overtone with the frequency $\omega_j$ there
are exactly $j - 1$ variations of sign ($j = 1, 2, \dots, n$).

55 See [239], [240], and [17], Chapter III.




From the fact that the matrix of the coefficients of influence $\| k_{ij} \|_1^n$
is oscillatory there follow other oscillatory properties of the continuum:
1) For $\omega = \omega_1$ the amplitude function $u(x)$, which is connected with the
amplitude displacements by (147), has no nodes; and, in general, for $\omega = \omega_j$
the function has $j - 1$ nodes ($j = 1, 2, \dots, n$); 2) The nodes of two adjacent
harmonics alternate, etc.

We cannot dwell here on the justification of these properties.⁵⁶


56 See [17], Chapters III and IV.


CHAPTER XIV 


APPLICATIONS OF THE THEORY OF MATRICES 
TO THE INVESTIGATION OF SYSTEMS OF 
LINEAR DIFFERENTIAL EQUATIONS 


§ 1. Systems of Linear Differential Equations with Variable 
Coefficients. General Concepts 


1. Suppose given a system of linear homogeneous differential equations of
the first order:

\[ \frac{dx_i}{dt} = \sum_{k=1}^{n} p_{ik}(t)\, x_k \quad (i = 1, 2, \dots, n), \tag{1} \]

where $p_{ik}(t)$ ($i, k = 1, 2, \dots, n$) are complex functions of a real argument $t$,
continuous in some interval, finite or infinite, of the variable $t$.¹
Setting $P(t) = \| p_{ik}(t) \|_1^n$ and $x = (x_1, x_2, \dots, x_n)$, we write (1) as

\[ \frac{dx}{dt} = P(t)\, x. \tag{2} \]

An integral matrix of the system (1) shall be defined as a square matrix
$X(t) = \| x_{ik}(t) \|_1^n$ whose columns are $n$ linearly independent solutions of
the system.

Since every column of $X$ satisfies (2), the integral matrix $X$ satisfies the
equation

\[ \frac{dX}{dt} = P(t)\, X. \tag{3} \]

In what follows, we shall consider the matrix equation (3) instead of
the system (1).

From the theorem on the existence and uniqueness of the solution of a
system of differential equations² it follows that the integral matrix $X(t)$
is uniquely determined when the value of the matrix for some ('initial')

1 In this section, all the relations that involve functions of t refer to the given interval. 


2 A proof of this theorem will be given in § 5. See also I. G. Petrowski (Petrovskii),
Vorlesungen über die Theorie der gewöhnlichen Differentialgleichungen, Leipzig, 1954
(translated from the Russian: Moscow, 1952).




value ¢ = ¢,) is known,’? X(t.) =X. For X, we can take an arbitrary non 
singular square matrix of order n. In the particular case where X(t.) = E, 
the integral matrix X(#) will be called normalized. 

Let us differentiate the determinant of X by differentiating its rows in
succession and let us then use the differential relations

$$\frac{dx_{ij}}{dt} = \sum_{k=1}^{n} p_{ik}\,x_{kj} \qquad (i, j = 1, 2, \ldots, n).$$

We obtain:

$$\frac{d\,|X|}{dt} = (p_{11} + p_{22} + \cdots + p_{nn})\,|X|.$$

Hence there follows the well-known Jacobi identity

$$|X| = c\,e^{\int_{t_0}^{t} \operatorname{tr} P\,dt}, \tag{4}$$

where c is a constant and

$$\operatorname{tr} P = p_{11} + p_{22} + \cdots + p_{nn}$$

is the trace of P(t).

Since the determinant |X| cannot vanish identically, we have $c \neq 0$.
But then it follows from the Jacobi identity that |X| is different from zero
for every value of the argument:

$$|X| \neq 0;$$

i.e., an integral matrix is non-singular for every value of the argument.
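The Jacobi identity (4) can be checked numerically. The following Python sketch integrates equation (3) for an arbitrarily chosen 2×2 coefficient matrix and compares the determinant of the normalized integral matrix with the exponential of the integrated trace. The matrix `P`, the Runge-Kutta integrator `integrate`, and the helper names are assumptions made for this illustration and do not come from the text.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def axpy(X, K, s):
    """Return X + s*K."""
    return [[x + s * k for x, k in zip(rx, rk)] for rx, rk in zip(X, K)]

def integrate(P, X, t0, t1, steps=2000):
    """Classical fourth-order Runge-Kutta scheme for dX/dt = P(t) X."""
    h = (t1 - t0) / steps
    for i in range(steps):
        t = t0 + i * h
        k1 = matmul(P(t), X)
        k2 = matmul(P(t + h / 2), axpy(X, k1, h / 2))
        k3 = matmul(P(t + h / 2), axpy(X, k2, h / 2))
        k4 = matmul(P(t + h), axpy(X, k3, h))
        for K, w in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
            X = axpy(X, K, w * h / 6)
    return X

def P(t):
    # Illustrative coefficient matrix; tr P = cos t - 0.5.
    return [[math.cos(t), 1.0], [t, -0.5]]

T = 2.0
X = integrate(P, [[1.0, 0.0], [0.0, 1.0]], 0.0, T)   # normalized: X(0) = E
det_X = X[0][0] * X[1][1] - X[0][1] * X[1][0]
jacobi = math.exp(math.sin(T) - 0.5 * T)             # c = |X(0)| = 1
print(det_X, jacobi)
```

The two printed numbers should agree to within the accuracy of the integrator, although the individual elements of X(t) have no closed form here.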
If $\tilde X(t)$ is a non-singular ($|\tilde X(t)| \neq 0$) particular solution of (3), then
the general solution is determined by the formula

$$X = \tilde X C, \tag{5}$$

where C is an arbitrary constant matrix.
For, by multiplying both sides of the equation

$$\frac{d\tilde X}{dt} = P\tilde X \tag{6}$$

by C on the right, we see that the matrix $\tilde X C$ also satisfies (3). On the other
hand, if X is an arbitrary solution of (3), then (6) implies:

$$\frac{dX}{dt} = \frac{d}{dt}\bigl(\tilde X \cdot \tilde X^{-1}X\bigr) = \frac{d\tilde X}{dt}\,\tilde X^{-1}X + \tilde X\,\frac{d}{dt}\bigl(\tilde X^{-1}X\bigr) = PX + \tilde X\,\frac{d}{dt}\bigl(\tilde X^{-1}X\bigr),$$

and hence by (3)

$$\tilde X\,\frac{d}{dt}\bigl(\tilde X^{-1}X\bigr) = 0$$

and

$$\tilde X^{-1}X = \text{const.} = C;$$

i.e., (5) holds.

All the integral matrices X of the system (1) are obtained by the formula
(5) with $|C| \neq 0$.

3 It is assumed that $t_0$ belongs to the given interval of t.
2. Let us consider the special case: 


$$\frac{dX}{dt} = AX, \tag{7}$$

where A is a constant matrix. Here $X = e^{At}$ is a particular non-singular
solution of (7),⁴ so that the general solution is of the form

$$X = e^{At}C, \tag{8}$$

where C is an arbitrary constant matrix.
Setting $t = t_0$ in (8) we find: $X_0 = e^{At_0}C$. Hence $C = e^{-At_0}X_0$ and
therefore (8) can be represented in the form

$$X = e^{A(t-t_0)}X_0. \tag{9}$$

This formula is equivalent to our earlier formula (46) of Chapter V (Vol. I,
p. 118).
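Formula (9) can be tried out on a small example. The Python sketch below sums the power series of the exponential directly and applies (9) to a 2×2 matrix whose exponential is a plane rotation. The helper `expm`, the matrix `A`, and the initial data are illustrative assumptions, not part of the text; truncating the series after a fixed number of terms is adequate only for matrices of small norm.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=40):
    """Partial sum of the series E + A + A^2/2! + ... (small matrices only)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]   # A^k / k!
        result = [[r + t for r, t in zip(rr, rt)] for rr, rt in zip(result, term)]
    return result

A = [[0.0, 1.0], [-1.0, 0.0]]     # e^{As} is a rotation through the angle s
t0, t = 0.5, 2.0
X0 = [[1.0, 2.0], [0.0, 1.0]]     # arbitrary non-singular initial value

s = t - t0
X = matmul(expm([[a * s for a in row] for row in A]), X0)    # formula (9)
expected = [[math.cos(s), 2 * math.cos(s) + math.sin(s)],
            [-math.sin(s), -2 * math.sin(s) + math.cos(s)]]
print(X)
print(expected)
```

For this particular A the closed form of the exponential is known, so the result of (9) can be compared with `expected` element by element.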
Let us now consider the so-called Cauchy system:

$$\frac{dX}{dt} = \frac{A}{t-a}\,X \qquad (A \text{ is a constant matrix}). \tag{10}$$

This case reduces to the preceding one by a change of argument:

$$u = \ln (t-a).$$

Therefore the general solution of (10) looks as follows:

$$X = e^{A\ln(t-a)}C = (t-a)^{A}C. \tag{11}$$

The functions $e^{At}$ and $(t-a)^{A}$ that occur in (8) and (11) may be represented
in the form (Vol. I, p. 117)

$$e^{At} = \sum_{k=1}^{s} \bigl(Z_{k1} + Z_{k2}t + \cdots + Z_{km_k}t^{m_k-1}\bigr)\,e^{\lambda_k t}, \tag{12}$$

$$(t-a)^{A} = \sum_{k=1}^{s} \bigl(Z_{k1} + Z_{k2}\ln(t-a) + \cdots + Z_{km_k}[\ln(t-a)]^{m_k-1}\bigr)\,(t-a)^{\lambda_k}. \tag{13}$$

Here

$$\psi(\lambda) = (\lambda-\lambda_1)^{m_1}(\lambda-\lambda_2)^{m_2}\cdots(\lambda-\lambda_s)^{m_s} \qquad (\lambda_i \neq \lambda_k \text{ for } i \neq k;\ i, k = 1, 2, \ldots, s)$$

is the minimal polynomial of A, and $Z_{kj}$ ($j = 1, 2, \ldots, m_k$; $k = 1, 2, \ldots, s$)
are linearly independent constant matrices that are polynomials in A.⁵

4 By term-by-term differentiation of the series $e^{At} = \sum_{k=0}^{\infty} \frac{A^k}{k!}\,t^k$ we find $\frac{d}{dt}e^{At} = Ae^{At}$.

Note. Sometimes an integral matrix of the system of differential equa- 
tions (1) is taken to be a matrix W in which the rows are linearly independ- 
ent solutions of the system. It is obvious that W is the transpose of X: 


$$W = X^{\mathsf T}.$$

When we go over to the transposed matrices on both sides of (3), we
obtain instead of (3) the following equation for W:

$$\frac{dW}{dt} = W P^{\mathsf T}(t). \tag{3'}$$


Here W is the first factor on the right-hand side, not the second, as X was 
in (3). 


§ 2. Lyapunov Transformations 


1. Let us now assume that in the system (1) (and in the equation (3))
the coefficient matrix $P(t) = \| p_{ik}(t) \|_1^n$ is a continuous bounded function
of t in the interval $[t_0, \infty)$.⁶

In place of the unknown functions $x_1, x_2, \ldots, x_n$ we introduce the new
unknown functions $y_1, y_2, \ldots, y_n$ by means of the transformation

$$x_i = \sum_{k=1}^{n} l_{ik}(t)\,y_k \qquad (i = 1, 2, \ldots, n). \tag{14}$$


5 Every term $X_k = (Z_{k1} + Z_{k2}t + \cdots + Z_{km_k}t^{m_k-1})\,e^{\lambda_k t}$ ($k = 1, 2, \ldots, s$) on the
right-hand side of (12) is a solution of (7). For the product $g(A)e^{At}$, with an arbitrary
function $g(\lambda)$, satisfies this equation. But $X_k = f(A) = g(A)e^{At}$ if $f(\lambda) = g(\lambda)e^{\lambda t}$ and
$g(\lambda_k) = 1$, and all the remaining m - 1 values of $g(\lambda)$ on the spectrum of A are zero
(see Vol. I, Chapter V, formula (17), on p. 104).

6 This means that each function $p_{ik}(t)$ ($i, k = 1, 2, \ldots, n$) is continuous and bounded
in the interval $[t_0, \infty)$, i.e., for $t \ge t_0$.

We impose the following restrictions on the matrix $L(t) = \| l_{ik}(t) \|_1^n$
of the transformation:

1. L(t) has a continuous derivative $\frac{dL}{dt}$ in the interval $[t_0, \infty)$;
2. L(t) and $\frac{dL}{dt}$ are bounded in the interval $[t_0, \infty)$;
3. There exists a constant m such that

$$0 < m < \text{absolute value of } |L(t)| \qquad (t \ge t_0),$$

i.e., the determinant |L(t)| is bounded in modulus from below by the
positive constant m.

A transformation (14) in which the coefficient matrix $L(t) = \| l_{ik}(t) \|_1^n$
satisfies 1.-3. will be called a Lyapunov transformation and the corresponding
matrix L(t) a Lyapunov matrix.

Such transformations were investigated by A. M. Lyapunov in his 
famous memoir ‘The General Problem of Stability of Motion’ [32]. 

Examples. 1. If L = const. and $|L| \neq 0$, then L satisfies the conditions
1.-3. Therefore a non-singular transformation with constant coefficients is
always a Lyapunov transformation.

2. If $D = \| d_{ik} \|_1^n$ is a matrix of simple structure with pure imaginary
characteristic values, then the matrix

$$L(t) = e^{Dt}$$

satisfies the conditions 1.-3. and is therefore a Lyapunov matrix.⁷


2. It is easy to verify that the conditions 1.-3. of a matrix L(t) imply the
existence of the inverse matrix $L^{-1}(t)$ also satisfying the conditions 1.-3.;
i.e., the inverse of a Lyapunov transformation is itself a Lyapunov trans- 
formation. In the same way it can be verified that two Lyapunov transfor- 
mations in succession yield a Lyapunov transformation. Thus, the Lyapunov 
transformations form a group. They have the following important property : 


If under the transformation (14) the system (1) goes over into 


$$\frac{dy_i}{dt} = \sum_{k=1}^{n} q_{ik}(t)\,y_k \qquad (i = 1, 2, \ldots, n) \tag{15}$$

and if the zero solution of this system is stable, asymptotically stable, or
unstable in the sense of Lyapunov (see Vol. I, Chapter V, § 6), then the zero
solution of the original system (1) has the same property.

7 Here all the $m_k = 1$ in (12) and $\lambda_k = i\varphi_k$ ($\varphi_k$ real, $k = 1, 2, \ldots, s$).


In other words, Lyapunov transformations do not alter the character 
of the zero solution (as regards stability). This is the reason why these 
transformations can be used in the investigation of stability in order to 
simplify the original system of equations. 

A Lyapunov transformation establishes a one-to-one correspondence be- 
tween the solutions of the systems (1) and (15); moreover, linearly inde- 
pendent solutions remain so after the transformation. Therefore a Lyapunov 
transformation carries an integral matrix X of (1) into some integral 
matrix Y of (15) such that 


$$X = L(t)\,Y. \tag{16}$$

In matrix notation, the system (15) has the form

$$\frac{dY}{dt} = Q(t)\,Y, \tag{17}$$

where $Q(t) = \| q_{ik}(t) \|_1^n$ is the coefficient matrix of (15).
Substituting LY for X in (3) and comparing the equation so obtained
with (17), we easily find the following formula which expresses Q in terms
of P and L:

$$Q = L^{-1}PL - L^{-1}\frac{dL}{dt}. \tag{18}$$

Two systems (1) and (15) or, what is the same, (3) and (17) will be
called equivalent (in the sense of Lyapunov) if they can be carried into one
another by a Lyapunov transformation. The coefficient matrices P and Q
of equivalent systems are always connected by the formula (18) in which
L satisfies the conditions 1.-3.
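Formula (18) can be illustrated with the rotation matrix $L(t) = e^{Dt}$, $D = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, which is a Lyapunov matrix of the kind described in Example 2. In the Python sketch below the constant matrix `P`, the finite-difference step `eps`, and all function names are assumptions chosen for the illustration. Since P = -0.3E + D commutes with L(t), the transformed coefficient matrix should come out as the constant matrix -0.3E for every t.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def L(t):
    """L(t) = e^{Dt} for D = [[0, 1], [-1, 0]]: a bounded matrix with |L| = 1."""
    return [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]

def L_inv(t):
    """Inverse of a rotation is its transpose."""
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

def Q(t, P, eps=1e-5):
    """Right-hand side of (18), with dL/dt taken by a central difference."""
    Lp, Lm = L(t + eps), L(t - eps)
    dL = [[(Lp[i][j] - Lm[i][j]) / (2 * eps) for j in range(2)] for i in range(2)]
    first = matmul(L_inv(t), matmul(P, L(t)))     # L^{-1} P L
    second = matmul(L_inv(t), dL)                 # L^{-1} dL/dt
    return [[first[i][j] - second[i][j] for j in range(2)] for i in range(2)]

P = [[-0.3, 1.0], [-1.0, -0.3]]       # characteristic values -0.3 +/- i
for t in (0.0, 1.7, 10.0):
    print(t, Q(t, P))
```

The system with coefficient matrix P, whose characteristic values are complex, is thus carried into one whose coefficient matrix has the real characteristic value -0.3 only.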


§ 3. Reducible Systems 


1. Among the systems of linear differential equations of the first order the 
simplest and best known are those with constant coefficients. It is, there- 
fore, of interest to study systems that can be carried by a Lyapunov trans- 
formation into systems with constant coefficients. Lyapunov has called such 
systems reducible. 

Suppose given a reducible system

$$\frac{dX}{dt} = PX. \tag{19}$$

Then some Lyapunov transformation

$$X = L(t)\,Y \tag{20}$$

carries it into a system

$$\frac{dY}{dt} = AY, \tag{21}$$

where A is a constant matrix. Therefore (19) has the particular solution

$$X = L(t)\,e^{At}. \tag{22}$$

It is easy to see that, conversely, every system (19) with a particular solution
of the form (22), where L(t) is a Lyapunov matrix and A a constant
matrix, is reducible and is reduced to the form (21) by means of the Lyapunov
transformation (20).

Following Lyapunov, we shall show that: Every system (19) with
periodic coefficients is reducible.⁸

Let P(t) in (19) be a continuous function in $(-\infty, +\infty)$ with period $\tau$:

$$P(t+\tau) = P(t). \tag{23}$$

Replacing t in (19) by $t + \tau$ and using (23), we obtain:

$$\frac{dX(t+\tau)}{dt} = P(t)\,X(t+\tau).$$

Thus, $X(t+\tau)$ is an integral matrix of (19) if X(t) is. Therefore

$$X(t+\tau) = X(t)\,V,$$

where V is a constant non-singular matrix. Since $|V| \neq 0$, we can
determine⁹

$$V^{\frac{t}{\tau}} = e^{\frac{t}{\tau}\ln V}.$$

This matrix function of t, just like X(t), is multiplied on the right by V
when the argument is increased by $\tau$. Therefore the 'quotient'

$$L(t) = X(t)\,V^{-\frac{t}{\tau}} = X(t)\,e^{-\frac{t}{\tau}\ln V}$$

is continuous and periodic with period $\tau$:

$$L(t+\tau) = L(t),$$

and with $|L| \neq 0$. The matrix L(t) satisfies the conditions 1.-3. of the
preceding section and is therefore a Lyapunov matrix.

On the other hand, since the solution X of (19) can be represented in
the form

$$X = L(t)\,e^{\frac{t}{\tau}\ln V},$$

the system (19) is reducible.

8 See [32], § 47.

9 Here $\ln V = f(V)$, where $f(\lambda)$ is any single-valued branch of $\ln \lambda$ in the simply-
connected domain G containing all the characteristic values of V, but not containing 0.
See Vol. I, Chapter V.
In this case the Lyapunov transformation

$$X = L(t)\,Y,$$

which carries (19) into the form

$$\frac{dY}{dt} = \frac{1}{\tau}\ln V \cdot Y,$$

has periodic coefficients with period $\tau$.

Lyapunov has established¹⁰ a very important criterion for stability and
instability of a first linear approximation to a non-linear system of
differential equations

$$\frac{dx_i}{dt} = \sum_{k=1}^{n} a_{ik}\,x_k + (**) \qquad (i = 1, 2, \ldots, n), \tag{24}$$

where we have convergent power series in $x_1, x_2, \ldots, x_n$ on the right-hand
side and where (**) denotes the sum of the terms of second and higher orders
in $x_1, x_2, \ldots, x_n$; the coefficients $a_{ik}$ ($i, k = 1, 2, \ldots, n$) of the linear terms
are constant.¹¹

LYAPUNOV'S CRITERION: The zero solution of (24) is stable (and even
asymptotically stable) if all the characteristic values of the coefficient matrix
$A = \| a_{ik} \|_1^n$ of the first linear approximation have negative real parts, and
unstable if at least one characteristic value has a positive real part.

10 See [32], § 24.
11 The coefficients in the non-linear terms may depend on t. These functional
coefficients are subject to certain restrictions (see [32], § 11).

2. The arguments used above enable us to apply this criterion to a system
whose linear terms have periodic coefficients:

$$\frac{dx_i}{dt} = \sum_{k=1}^{n} p_{ik}(t)\,x_k + (**). \tag{25}$$

For on the basis of the preceding arguments we reduce the system (25) to
the form (24) by means of a Lyapunov transformation, where

$$A = \| a_{ik} \|_1^n = \frac{1}{\tau}\ln V$$

and where V is the constant matrix by which an integral matrix of the
corresponding linear system (19) is multiplied when the argument is changed
by $\tau$. Without loss of generality, we may assume that $\tau > 0$. By the
properties of Lyapunov transformations the zero solutions of the original and of
the transformed systems are simultaneously stable, asymptotically stable,
or unstable. But the characteristic values $\lambda_i$ and $\nu_i$ ($i = 1, 2, \ldots, n$) of A
and V are connected by the formula

$$\lambda_i = \frac{1}{\tau}\ln \nu_i \qquad (i = 1, 2, \ldots, n),$$

so that $\operatorname{Re}\lambda_i = \frac{1}{\tau}\ln|\nu_i|$ is negative exactly when $|\nu_i| < 1$.
Therefore, by applying Lyapunov's criterion to the reduced system we
find:¹²

The zero solution of (25) is asymptotically stable if all the characteristic
values $\nu_1, \nu_2, \ldots, \nu_n$ of V are of modulus less than 1 and unstable if at least
one characteristic value is of modulus greater than 1.
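The criterion just stated can be tried on a second-order example. The Python sketch below takes a damped oscillator with a periodically varying stiffness, written as a first-order system with period $\tau = 2\pi$, computes V as the value at $t = \tau$ of the integral matrix normalized at t = 0, and examines the moduli of its characteristic values. The coefficient values, the Runge-Kutta integrator, and all names are assumptions made for the illustration; only the linear part of (25) enters the computation.

```python
import cmath
import math

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def axpy(X, K, s):
    """Return X + s*K."""
    return [[x + s * k for x, k in zip(rx, rk)] for rx, rk in zip(X, K)]

def integrate(P, X, t0, t1, steps):
    """Classical fourth-order Runge-Kutta scheme for dX/dt = P(t) X."""
    h = (t1 - t0) / steps
    for i in range(steps):
        t = t0 + i * h
        k1 = matmul(P(t), X)
        k2 = matmul(P(t + h / 2), axpy(X, k1, h / 2))
        k3 = matmul(P(t + h / 2), axpy(X, k2, h / 2))
        k4 = matmul(P(t + h), axpy(X, k3, h))
        for K, w in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
            X = axpy(X, K, w * h / 6)
    return X

A0, B0, DAMP = 2.0, 0.1, 0.2      # illustrative values, not from the text
TAU = 2 * math.pi                 # period of the coefficients

def P(t):
    # x'' + DAMP x' + (A0 + B0 cos t) x = 0 as a first-order system
    return [[0.0, 1.0], [-(A0 + B0 * math.cos(t)), -DAMP]]

# With X(0) = E the relation X(t + tau) = X(t) V gives V = X(tau).
V = integrate(P, [[1.0, 0.0], [0.0, 1.0]], 0.0, TAU, 4000)
tr = V[0][0] + V[1][1]
det_V = V[0][0] * V[1][1] - V[0][1] * V[1][0]
root = cmath.sqrt(tr * tr - 4 * det_V)
multipliers = [(tr + root) / 2, (tr - root) / 2]        # characteristic values of V
exponents = [cmath.log(v) / TAU for v in multipliers]   # one branch of (1/tau) ln nu
stable = max(abs(v) for v in multipliers) < 1
print(multipliers, exponents, stable)
```

By the Jacobi identity (4) the determinant of V must equal $e^{-\mathrm{DAMP}\cdot\tau}$, which gives an independent check on the integration.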


Lyapunov has established his criterion for the stability of a linear
approximation for a considerably wider class of systems, namely those of the
form (24) in which the linear approximation is not necessarily a system with
constant coefficients, but belongs to a class of systems that he has called
regular.¹³

The class of regular linear systems contains all the reducible systems.

A criterion for instability in the case when the first linear approximation
is a regular system was set up by N. G. Chetaev.¹⁴


§ 4. The Canonical Form of a Reducible System. Erugin’s Theorem 
1. Suppose that a reducible system (19) and an equivalent system

$$\frac{dY}{dt} = AY$$

(in the sense of Lyapunov) are given, where A is a constant matrix.

We shall be interested in the question: To what extent is the matrix A
determined by the given system (19)? This question can also be formulated
as follows:

When are two systems

$$\frac{dY}{dt} = AY \quad\text{and}\quad \frac{dZ}{dt} = BZ,$$

where A and B are constant matrices, equivalent in the sense of Lyapunov;
i.e., when can they be carried into one another by a Lyapunov
transformation?

12 Loc. cit., § 55.
13 Loc. cit., § 9.
14 See [9], p. 181.

In order to answer this question we introduce the notion of matrices with 
one and the same real part of the spectrum. 

We shall say that two matrices A and B of order n have one and the same 
real part of the spectrum if and only if the elementary divisors of A and B 
are of the form 


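$$(\lambda-\lambda_1)^{m_1}, (\lambda-\lambda_2)^{m_2}, \ldots, (\lambda-\lambda_s)^{m_s}; \qquad (\lambda-\mu_1)^{m_1}, (\lambda-\mu_2)^{m_2}, \ldots, (\lambda-\mu_s)^{m_s},$$

where

$$\operatorname{Re}\lambda_k = \operatorname{Re}\mu_k \qquad (k = 1, 2, \ldots, s).$$

Then the following theorem due to N. P. Erugin holds:¹⁵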


THEOREM 1 (Erugin): Two systems

$$\frac{dY}{dt} = AY \quad\text{and}\quad \frac{dZ}{dt} = BZ \tag{26}$$

(A and B are constant matrices of order n) are equivalent in the sense of
Lyapunov if and only if the matrices A and B have one and the same real
part of the spectrum.

Proof. Suppose that the systems (26) are given. We reduce A to the
normal Jordan form¹⁶ (see Vol. I, Chapter VI, § 7)

$$A = T\,\{\lambda_1E_1 + H_1,\ \lambda_2E_2 + H_2,\ \ldots,\ \lambda_sE_s + H_s\}\,T^{-1}, \tag{27}$$

where

$$\lambda_k = \alpha_k + i\beta_k \qquad (\alpha_k, \beta_k \text{ are real numbers};\ k = 1, 2, \ldots, s). \tag{28}$$

In accordance with (27) and (28) we set

$$A_1 = T\,\{\alpha_1E_1 + H_1,\ \alpha_2E_2 + H_2,\ \ldots,\ \alpha_sE_s + H_s\}\,T^{-1}, \qquad A_2 = T\,\{i\beta_1E_1,\ i\beta_2E_2,\ \ldots,\ i\beta_sE_s\}\,T^{-1}. \tag{29}$$

Then

$$A = A_1 + A_2, \qquad A_1A_2 = A_2A_1. \tag{30}$$

We define a matrix L(t) by the equation

$$L(t) = e^{A_2t}.$$

L(t) is a Lyapunov matrix (see Example 2 on p. 117).

15 Our proof of the theorem differs from that of Erugin.

16 $E_k$ is the unit matrix; in $H_k$ the elements of the first superdiagonal are 1, and the
remaining elements are zero; the orders of $E_k$, $H_k$ are the degrees of the k-th elementary
divisor of A, i.e., $m_k$ ($k = 1, 2, \ldots, s$).
But by (30) a particular solution of the first of the systems (26) is of
the form

$$e^{At} = e^{A_2t}e^{A_1t} = L(t)\,e^{A_1t}.$$

Hence it follows that the first of the systems (26) is equivalent to

$$\frac{dU}{dt} = A_1U, \tag{31}$$

where, by (29), the matrix $A_1$ has real characteristic values and its spectrum
coincides with the real part of the spectrum of A.

Similarly, we replace the second of the systems (26) by the equivalent
system

$$\frac{dV}{dt} = B_1V, \tag{32}$$

where the matrix $B_1$ has real characteristic values and its spectrum coincides
with the real part of the spectrum of B.

Our theorem will be proved if we can show that the two systems (31)
and (32) in which $A_1$ and $B_1$ are constant matrices with real characteristic
values are equivalent if and only if $A_1$ and $B_1$ are similar.¹⁷

Suppose that the Lyapunov transformation

$$U = L_1V$$

carries (31) into (32). Then the matrix $L_1$ satisfies the equation

$$\frac{dL_1}{dt} = A_1L_1 - L_1B_1. \tag{33}$$

This matrix equation for $L_1$ is equivalent to a system of $n^2$ differential
equations in the $n^2$ elements of $L_1$. The right-hand side of (33) is a linear
operation on the 'vector' $L_1$ in an $n^2$-dimensional space:

$$\frac{dL_1}{dt} = F(L_1) \qquad \bigl(F(L_1) = A_1L_1 - L_1B_1\bigr). \tag{33'}$$

17 This proposition implies Theorem 1, since the equivalence of the systems (31) and
(32) means that the systems (26) are equivalent, and the similarity of $A_1$ and $B_1$ means
that these matrices have the same elementary divisors, so that the matrices A and B have
one and the same real part of the spectrum.


Every characteristic value of the linear operator F (and of the
corresponding matrix of order $n^2$) can be represented in the form of a difference
$\gamma - \delta$, where $\gamma$ is a characteristic value of $A_1$ and $\delta$ a characteristic value
of $B_1$.¹⁸ Hence it follows that the operator F has only real characteristic
values.

We denote by

$$\psi(\lambda) = (\lambda-\lambda_1)^{m_1}(\lambda-\lambda_2)^{m_2}\cdots(\lambda-\lambda_u)^{m_u}$$

(the $\lambda_i$ are real; $\lambda_i \neq \lambda_j$ for $i \neq j$; $i, j = 1, 2, \ldots, u$) the minimal polynomial
of F. Then the solution $L_1(t) = e^{Ft}L^{(0)}$ of (33') can, by formula (12)
(p. 116), be written as follows:

$$L_1(t) = \sum_{k=1}^{u}\sum_{j=0}^{m_k-1} L_{kj}\,t^{j}e^{\lambda_k t}, \tag{34}$$

where the $L_{kj}$ are constant matrices of order n. Since the matrix $L_1(t)$ is
bounded in the interval $(t_0, \infty)$, both for every $\lambda_k > 0$ and for $\lambda_k = 0$ and
$j > 0$, the corresponding matrices $L_{kj} = O$. We denote by $L_-(t)$ the sum of
all the terms in (34) for which $\lambda_k < 0$. Then

$$L_1(t) = L_-(t) + L_0, \tag{35}$$

where

$$\lim_{t \to +\infty} L_-(t) = 0, \qquad \lim_{t \to +\infty} \frac{dL_-(t)}{dt} = 0, \qquad L_0 = \text{const.} \tag{35'}$$

Then, by (35) and (35'),

$$\lim_{t \to +\infty} L_1(t) = L_0,$$

from which it follows that

$$|L_0| \neq 0,$$

because the determinant $|L_1(t)|$ is bounded in modulus from below.
When we substitute for $L_1(t)$ in (33) the sum $L_-(t) + L_0$, we obtain:

$$\frac{dL_-(t)}{dt} - A_1L_-(t) + L_-(t)B_1 = A_1L_0 - L_0B_1;$$

hence by (35')

$$A_1L_0 - L_0B_1 = O$$

and therefore

$$B_1 = L_0^{-1}A_1L_0. \tag{36}$$

Conversely, if (36) holds, then the Lyapunov transformation

$$U = L_0V$$

carries (31) into (32). This completes the proof of the theorem.

18 For let $\lambda_0$ be any characteristic value of the operator F. Then there exists a matrix
$L \neq O$ such that $F(L) = \lambda_0L$, or

$$(A_1 - \lambda_0E)L = LB_1. \tag{*}$$

The matrices $A_1 - \lambda_0E$ and $B_1$ have at least one characteristic value in common, since
otherwise there would exist a polynomial $g(\lambda)$ such that

$$g(A_1 - \lambda_0E) = O, \qquad g(B_1) = E,$$

and this is impossible, because it follows from (*) that $g(A_1 - \lambda_0E)\cdot L = L\cdot g(B_1)$
and $L \neq O$. But if $A_1 - \lambda_0E$ and $B_1$ have a common characteristic value, then $\lambda_0 = \gamma - \delta$,
where $\gamma$ and $\delta$ are characteristic values of $A_1$ and $B_1$, respectively. A detailed study of
the operator F can be found in the paper [179] by F. Golubchikov.


2. From this theorem it follows that: Every reducible system (19) can be 
carried by the Lyapunov transformation X =LY into the form 


$$\frac{dY}{dt} = JY,$$


where J is a Jordan matrix with real characteristic values. This canonical 
form of the system is uniquely determined by the given matrix P(t) to 
within the order of the diagonal blocks of J. 


§ 5. The Matricant 


1. We consider a system of differential equations

$$\frac{dX}{dt} = P(t)\,X, \tag{37}$$

where $P(t) = \| p_{ik}(t) \|_1^n$ is a continuous matrix function of the argument
t in some interval (a, b).¹⁹

19 (a, b) is an arbitrary interval (finite or infinite). All the elements $p_{ik}(t)$ ($i, k =
1, 2, \ldots, n$) of P(t) are complex functions of the real argument t, continuous in (a, b).
Everything that follows remains valid if, instead of continuity, we require (in every finite
subinterval of (a, b)) only boundedness and Riemann integrability of all the functions
$p_{ik}(t)$.


We use the method of successive approximations to determine a normalized
solution of (37), i.e., a solution that for $t = t_0$ becomes the unit matrix
($t_0$ is a fixed number of the interval (a, b)). The successive approximations
$X_k$ ($k = 0, 1, 2, \ldots$) are found from the recurrence relations

$$\frac{dX_k}{dt} = P(t)\,X_{k-1} \qquad (k = 1, 2, \ldots),$$

when $X_0$ is taken to be the unit matrix E.
Setting $X_k(t_0) = E$ ($k = 0, 1, 2, \ldots$) we may represent $X_k$ in the form

$$X_k = E + \int_{t_0}^{t} P(\tau)\,X_{k-1}\,d\tau.$$

Thus

$$X_0 = E, \quad X_1 = E + \int_{t_0}^{t} P(\tau)\,d\tau, \quad X_2 = E + \int_{t_0}^{t} P(\tau)\,d\tau + \int_{t_0}^{t} P(\tau)\int_{t_0}^{\tau} P(\sigma)\,d\sigma\,d\tau, \ \ldots,$$

i.e., $X_k$ ($k = 0, 1, 2, \ldots$) is the sum of the first k + 1 terms of the matrix
series

$$E + \int_{t_0}^{t} P(\tau)\,d\tau + \int_{t_0}^{t} P(\tau)\int_{t_0}^{\tau} P(\sigma)\,d\sigma\,d\tau + \cdots. \tag{38}$$

In order to prove that this series is absolutely and uniformly convergent
in every closed subinterval of the interval (a, b) and determines the required
solution of (37), we construct a majorant.

We define non-negative functions g(t) and h(t) in (a, b) by the
equations²⁰

$$g(t) = \max\,\bigl[\,|p_{11}(t)|,\ |p_{12}(t)|,\ \ldots,\ |p_{nn}(t)|\,\bigr], \qquad h(t) = \Bigl|\int_{t_0}^{t} g(\tau)\,d\tau\Bigr|.$$

It is easy to verify that g(t), and consequently h(t) as well, is continuous
in (a, b).²¹

Each of the $n^2$ scalar series into which the matrix series (38) splits is
majorized by the series

$$1 + h(t) + \frac{n\,h^2(t)}{2!} + \frac{n^2h^3(t)}{3!} + \cdots. \tag{39}$$

20 By definition, the value of g(t) for any value of t is the largest of the $n^2$ moduli of
the values of $p_{ik}(t)$ ($i, k = 1, 2, \ldots, n$) for that value of t.

21 The continuity of g(t) at any point $t_1$ of the interval (a, b) follows from the fact
that the difference $g(t) - g(t_1)$ for t sufficiently near $t_1$ always coincides with one of
the $n^2$ differences $|p_{ik}(t)| - |p_{ik}(t_1)|$ ($i, k = 1, 2, \ldots, n$).


For

$$\Bigl|\Bigl(\int_{t_0}^{t} P(\tau)\,d\tau\Bigr)_{ik}\Bigr| = \Bigl|\int_{t_0}^{t} p_{ik}(\tau)\,d\tau\Bigr| \le \Bigl|\int_{t_0}^{t} g(\tau)\,d\tau\Bigr| = h(t),$$

$$\Bigl|\Bigl(\int_{t_0}^{t} P(\tau)\int_{t_0}^{\tau} P(\sigma)\,d\sigma\,d\tau\Bigr)_{ik}\Bigr| = \Bigl|\sum_{j=1}^{n}\int_{t_0}^{t} p_{ij}(\tau)\int_{t_0}^{\tau} p_{jk}(\sigma)\,d\sigma\,d\tau\Bigr| \le n\,\Bigl|\int_{t_0}^{t} g(\tau)\int_{t_0}^{\tau} g(\sigma)\,d\sigma\,d\tau\Bigr| = \frac{n\,h^2(t)}{2},$$

etc.

The series (39) converges in (a, b) and converges uniformly in every
closed part of this interval. Hence it follows that the matrix series (38) also
converges in (a, b) and does so absolutely and uniformly in every closed
interval contained in (a, b).

By term-by-term differentiation we verify that the sum of (38) is a
solution of (37); this solution becomes E for $t = t_0$. The term-by-term
differentiation of (38) is permissible, because the series obtained after
differentiation differs from (38) by the factor P and therefore, like (38), is
uniformly convergent in every closed interval contained in (a, b).

Thus we have proved the theorem on the existence of a normalized solution
of (37). This solution will be denoted by $\Omega_{t_0}^{t}(P)$ or simply $\Omega_{t_0}^{t}$. Every
other solution, as we have shown in § 1, is of the form

$$X = \Omega_{t_0}^{t}\,C,$$

where C is an arbitrary constant matrix. From this formula it follows that
every solution, in particular the normalized one, is uniquely determined by
its value for $t = t_0$.

This normalized solution $\Omega_{t_0}^{t}$ of (37) is often called the matricant.

We have seen that the matricant can be represented in the form of a
series²²

$$\Omega_{t_0}^{t} = E + \int_{t_0}^{t} P(\tau)\,d\tau + \int_{t_0}^{t} P(\tau)\int_{t_0}^{\tau} P(\sigma)\,d\sigma\,d\tau + \cdots, \tag{40}$$

which converges absolutely and uniformly in every closed interval in which
P(t) is continuous.
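The successive approximations that lead to the series (40) can be carried out numerically on a grid. In the Python sketch below the integrals are replaced by the trapezoidal rule, and the coefficient matrix is taken as $P(t) = \cos t \cdot A$ with a constant A, so that its values commute and the matricant is known in closed form: a rotation through the angle $\sin t$. The choice of P, the grid size, the number of iterations, and the function names are assumptions made for the illustration.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def P(t):
    # cos(t) * [[0, 1], [-1, 0]]; values at different t commute
    c = math.cos(t)
    return [[0.0, c], [-c, 0.0]]

def matricant_by_iteration(P, t0, t1, n_grid=2000, iterations=25):
    """Iterate X_k = E + integral of P X_{k-1} on a uniform grid."""
    h = (t1 - t0) / n_grid
    ts = [t0 + i * h for i in range(n_grid + 1)]
    E = [[1.0, 0.0], [0.0, 1.0]]
    X = [E for _ in ts]                                   # X_0(t) = E
    for _ in range(iterations):
        F = [matmul(P(t), Xi) for t, Xi in zip(ts, X)]    # integrand P X_{k-1}
        new = [E]
        for i in range(1, len(ts)):
            prev = new[-1]
            new.append([[prev[r][c] + h / 2 * (F[i - 1][r][c] + F[i][r][c])
                         for c in range(2)] for r in range(2)])
        X = new
    return X[-1]                                          # value at t = t1

t1 = 1.5
omega = matricant_by_iteration(P, 0.0, t1)
s = math.sin(t1)
exact = [[math.cos(s), math.sin(s)], [-math.sin(s), math.cos(s)]]
print(omega)
print(exact)
```

The remaining discrepancy between `omega` and `exact` comes from the trapezoidal rule, not from the iteration, whose terms decrease like those of the majorant (39).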


2. We mention a few formulas involving the matricant.

1. $\Omega_{t_0}^{t} = \Omega_{t_1}^{t}\,\Omega_{t_0}^{t_1}$ ($t_0, t_1, t \in (a, b)$).

For since $\Omega_{t_0}^{t}$ and $\Omega_{t_1}^{t}$ are two solutions of (37), we have

$$\Omega_{t_0}^{t} = \Omega_{t_1}^{t}\,C \qquad (C \text{ is a constant matrix}).$$

Setting $t = t_1$ in this equation, we obtain $C = \Omega_{t_0}^{t_1}$.

2. $\Omega_{t_0}^{t}(P + Q) = \Omega_{t_0}^{t}(P)\,\Omega_{t_0}^{t}(S)$ with $S = [\Omega_{t_0}^{t}(P)]^{-1}\,Q\,\Omega_{t_0}^{t}(P)$.

To derive this formula we set:

$$X = \Omega_{t_0}^{t}(P), \qquad Y = \Omega_{t_0}^{t}(P + Q),$$

and

$$Y = XZ. \tag{41}$$

Differentiating (41) term by term, we find:

$$(P + Q)\,XZ = PXZ + X\frac{dZ}{dt}.$$

Hence

$$\frac{dZ}{dt} = X^{-1}QXZ$$

and since it follows from (41) that $Z(t_0) = E$,

$$Z = \Omega_{t_0}^{t}(X^{-1}QX).$$

When we substitute their respective matricants for X, Y, Z in (41), we
obtain the formula 2.

3. $\ln\,|\Omega_{t_0}^{t}(P)| = \int_{t_0}^{t} \operatorname{tr} P\,d\tau$.

This formula follows from the Jacobi identity (4) (p. 114) when we
substitute $\Omega_{t_0}^{t}(P)$ for X(t) in that identity.

4. If $A = \| a_{ik} \|_1^n$ = const., then

$$\Omega_{t_0}^{t}(A) = e^{A(t - t_0)}.$$

22 The representation of the matricant in the form of such a series was first obtained
by Peano [308].


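We introduce the following notation. If $P = \| p_{ik} \|_1^n$, then we shall
mean by mod P the matrix

$$\operatorname{mod} P = \| \,|p_{ik}|\, \|_1^n.$$

Furthermore, if $A = \| a_{ik} \|_1^n$ and $B = \| b_{ik} \|_1^n$ are two real matrices and

$$a_{ik} \le b_{ik} \qquad (i, k = 1, 2, \ldots, n),$$

we shall write

$$A \le B.$$

Then it follows from the representation (40) that:

5. If $\operatorname{mod} P(t) \le Q(t)$ ($t \ge t_0$), then the series (40) for $\Omega_{t_0}^{t}(P)$
is majorized, beginning with the first term, by the same series for $\Omega_{t_0}^{t}(Q)$,
so that for all $t \ge t_0$

$$\operatorname{mod} \Omega_{t_0}^{t}(P) \le \Omega_{t_0}^{t}(Q), \qquad \operatorname{mod}\,[\Omega_{t_0}^{t}(P) - E] \le \Omega_{t_0}^{t}(Q) - E,$$

$$\operatorname{mod}\,\Bigl[\Omega_{t_0}^{t}(P) - E - \int_{t_0}^{t} P\,d\tau\Bigr] \le \Omega_{t_0}^{t}(Q) - E - \int_{t_0}^{t} Q\,d\tau, \quad \text{etc.}$$

In what follows we shall denote the matrix of order n in which all the
elements are 1 by I:

$$I = \| 1 \|.$$

We consider the function g(t) defined on p. 126. Then we have

$$\operatorname{mod} P(t) \le g(t)\,I.$$

But $\Omega_{t_0}^{t}(g(t)I)$ is the normalized solution of the equation

$$\frac{dX}{dt} = g(t)\,I\,X.$$

Therefore, by 4.,²³

$$\Omega_{t_0}^{t}(g(t)I) = e^{h(t)I} = E + \Bigl(h(t) + \frac{n\,h^2(t)}{2!} + \frac{n^2h^3(t)}{3!} + \cdots\Bigr)I, \tag{42}$$

where

$$h(t) = \int_{t_0}^{t} g(\tau)\,d\tau, \qquad g(t) = \max_{1 \le i, k \le n} |p_{ik}(t)|.$$

Therefore it follows from 5. and (42) that:

6. $\operatorname{mod} \Omega_{t_0}^{t}(P) \le E + \frac{1}{n}\bigl(e^{nh(t)} - 1\bigr)I$,

$\operatorname{mod}\,[\Omega_{t_0}^{t}(P) - E] \le \frac{1}{n}\bigl(e^{nh(t)} - 1\bigr)I$,

$\operatorname{mod}\,\bigl[\Omega_{t_0}^{t}(P) - E - \int_{t_0}^{t} P\,d\tau\bigr] \le \frac{1}{n}\bigl(e^{nh(t)} - 1 - nh(t)\bigr)I$, etc.

We shall now derive an important formula giving an estimate for the
modulus of the difference between two matricants:

23 By replacing the independent variable t by $h = \int_{t_0}^{t} g(\tau)\,d\tau$.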


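7. $\operatorname{mod}\,[\Omega_{t_0}^{t}(P) - \Omega_{t_0}^{t}(Q)] \le \frac{1}{n}\,e^{nq(t - t_0)}\bigl(e^{nd(t - t_0)} - 1\bigr)I$ ($t \ge t_0$),
if

$$\operatorname{mod} Q \le qI, \qquad \operatorname{mod}\,(P - Q) \le dI, \qquad I = \| 1 \|$$

(q, d are non-negative numbers; n is the order of P and Q).
We denote the difference P - Q by D. Then

$$P = Q + D, \qquad \operatorname{mod} D \le d \cdot I.$$

Using the expansion (40) of the matricant in a series, we find:

$$\Omega_{t_0}^{t}(Q + D) - \Omega_{t_0}^{t}(Q) = \int_{t_0}^{t} D(\tau)\,d\tau + \int_{t_0}^{t} D(\tau)\int_{t_0}^{\tau} Q(\sigma)\,d\sigma\,d\tau + \int_{t_0}^{t} Q(\tau)\int_{t_0}^{\tau} D(\sigma)\,d\sigma\,d\tau + \int_{t_0}^{t} D(\tau)\int_{t_0}^{\tau} D(\sigma)\,d\sigma\,d\tau + \cdots.$$

From this expression it is clear that, for $t \ge t_0$,

$$\operatorname{mod}\,[\Omega_{t_0}^{t}(Q + D) - \Omega_{t_0}^{t}(Q)] \le \Omega_{t_0}^{t}(\operatorname{mod} Q + \operatorname{mod} D) - \Omega_{t_0}^{t}(\operatorname{mod} Q)$$

$$\le \Omega_{t_0}^{t}((q + d)I) - \Omega_{t_0}^{t}(qI) = e^{(q + d)I(t - t_0)} - e^{qI(t - t_0)} = e^{qI(t - t_0)}\bigl(e^{dI(t - t_0)} - E\bigr)$$

$$= \Bigl[E + \frac{1}{n}\bigl(e^{nq(t - t_0)} - 1\bigr)I\Bigr]\,\frac{1}{n}\bigl(e^{nd(t - t_0)} - 1\bigr)I = \frac{1}{n}\,e^{nq(t - t_0)}\bigl(e^{nd(t - t_0)} - 1\bigr)I.$$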


We shall now show how to express by means of the matricant the general
solution of a system of linear differential equations with right-hand sides:

$$\frac{dx_i}{dt} = \sum_{k=1}^{n} p_{ik}(t)\,x_k + f_i(t) \qquad (i = 1, 2, \ldots, n); \tag{43}$$

$p_{ik}(t)$ and $f_i(t)$ ($i, k = 1, 2, \ldots, n$) are continuous functions of t in some
interval.

By introducing the column matrices ('vectors') $x = (x_1, x_2, \ldots, x_n)$ and
$f = (f_1, f_2, \ldots, f_n)$ and the square matrix $P = \| p_{ik} \|_1^n$, we write the system
as follows:

$$\frac{dx}{dt} = P(t)\,x + f(t). \tag{43'}$$

We shall look for a solution of this equation in the form

$$x = \Omega_{t_0}^{t}(P)\,z, \tag{44}$$

where z is an unknown column depending on t. We substitute this expression
for x in (43') and obtain:

$$P\,\Omega_{t_0}^{t}(P)\,z + \Omega_{t_0}^{t}(P)\,\frac{dz}{dt} = P\,\Omega_{t_0}^{t}(P)\,z + f(t);$$

hence

$$\frac{dz}{dt} = [\Omega_{t_0}^{t}(P)]^{-1} f(t).$$

Integrating this, we find:

$$z = \int_{t_0}^{t} [\Omega_{t_0}^{\tau}(P)]^{-1} f(\tau)\,d\tau + c,$$

where c is an arbitrary constant vector. Substituting this expression in
(44), we obtain:

$$x = \Omega_{t_0}^{t}(P)\int_{t_0}^{t} [\Omega_{t_0}^{\tau}(P)]^{-1} f(\tau)\,d\tau + \Omega_{t_0}^{t}(P)\,c. \tag{45}$$

When we give to t the value $t_0$, we find: $x(t_0) = c$. Therefore (45) assumes
the form

$$x = \Omega_{t_0}^{t}(P)\,x(t_0) + \int_{t_0}^{t} K(t, \tau)\,f(\tau)\,d\tau, \tag{45'}$$

where

$$K(t, \tau) = \Omega_{t_0}^{t}(P)\,[\Omega_{t_0}^{\tau}(P)]^{-1}$$

is the so-called Cauchy matrix.
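Formula (45') can be illustrated in the simplest case of a constant coefficient matrix, where by formula 4. of this section the Cauchy matrix is $K(t, \tau) = e^{A(t - \tau)}$. The Python sketch below takes $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, for which this exponential is a rotation, together with a constant right-hand side. The forcing term, the midpoint quadrature, and all names are assumptions chosen for the example.

```python
import math

def K(t, tau):
    """Cauchy matrix e^{A(t - tau)} for A = [[0, 1], [-1, 0]]."""
    s = t - tau
    return [[math.cos(s), math.sin(s)], [-math.sin(s), math.cos(s)]]

def f(tau):
    """Right-hand side of the system; a constant vector in this example."""
    return [0.0, 1.0]

def solve(t, x0, t0=0.0, n=2000):
    """Evaluate (45'): matricant times x(t0) plus the integral of K f."""
    Om = K(t, t0)                                  # matricant of a constant A
    x = [Om[0][0] * x0[0] + Om[0][1] * x0[1],
         Om[1][0] * x0[0] + Om[1][1] * x0[1]]
    h = (t - t0) / n
    for j in range(n):                             # midpoint rule in tau
        tau = t0 + (j + 0.5) * h
        Kt, ft = K(t, tau), f(tau)
        x[0] += h * (Kt[0][0] * ft[0] + Kt[0][1] * ft[1])
        x[1] += h * (Kt[1][0] * ft[0] + Kt[1][1] * ft[1])
    return x

t = 2.0
x_rest = solve(t, [0.0, 0.0])     # starts at the origin
x_equi = solve(t, [1.0, 0.0])     # starts at the equilibrium of x1'' + x1 = 1
print(x_rest, [1 - math.cos(t), math.sin(t)])
print(x_equi)
```

The system here is equivalent to the scalar equation $x_1'' + x_1 = 1$, so the first result should be close to $(1 - \cos t, \sin t)$ and the second should stay at the equilibrium (1, 0).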


§ 6. The Multiplicative Integral. The Infinitesimal Calculus 
of Volterra 


1. Let us consider the matricant $\Omega_{t_0}^{t}(P)$. We divide the basic interval
$(t_0, t)$ into n parts by introducing intermediate points $t_1, t_2, \ldots, t_{n-1}$ and
set $\Delta t_k = t_k - t_{k-1}$ ($k = 1, 2, \ldots, n$; $t_n = t$). Then by property 1. of the
matricant (see the preceding section),

$$\Omega_{t_0}^{t} = \Omega_{t_{n-1}}^{t} \cdots \Omega_{t_1}^{t_2}\,\Omega_{t_0}^{t_1}. \tag{46}$$

In the interval $(t_{k-1}, t_k)$ we choose an intermediate point $\tau_k$ ($k = 1, 2, \ldots, n$).
By regarding the $\Delta t_k$ as small quantities of the first order we can take, for
the computation of $\Omega_{t_{k-1}}^{t_k}$, to within small quantities of the second order,
$P(t) \approx \text{const.} = P(\tau_k)$. Then

$$\Omega_{t_{k-1}}^{t_k} = e^{P(\tau_k)\Delta t_k} + (**) = E + P(\tau_k)\,\Delta t_k + (**); \tag{47}$$

here we denote by the symbol (**) the sum of terms beginning with terms
of the second order.
From (46) and (47) we find:

$$\Omega_{t_0}^{t} = e^{P(\tau_n)\Delta t_n} \cdots e^{P(\tau_2)\Delta t_2}\,e^{P(\tau_1)\Delta t_1} + (*) \tag{48}$$

and

$$\Omega_{t_0}^{t} = [E + P(\tau_n)\,\Delta t_n] \cdots [E + P(\tau_2)\,\Delta t_2]\,[E + P(\tau_1)\,\Delta t_1] + (*). \tag{49}$$

When we pass to the limit by increasing the number of intervals
indefinitely and letting the length of these intervals tend to zero (the small terms
(*) disappear in the limit),²⁴ we obtain the exact limit formulas

$$\Omega_{t_0}^{t}(P) = \lim_{\Delta t_k \to 0} \bigl[e^{P(\tau_n)\Delta t_n} \cdots e^{P(\tau_2)\Delta t_2}\,e^{P(\tau_1)\Delta t_1}\bigr] \tag{48'}$$

and

$$\Omega_{t_0}^{t}(P) = \lim_{\Delta t_k \to 0} [E + P(\tau_n)\,\Delta t_n] \cdots [E + P(\tau_2)\,\Delta t_2]\,[E + P(\tau_1)\,\Delta t_1]. \tag{49'}$$

The expression under the limit sign on the right-hand side of the latter
equation is the product integral.²⁵ We shall call its limit the multiplicative
integral and denote it by the symbol

$$\overset{\frown}{\int}_{t_0}^{t} [E + P(t)\,dt] = \lim_{\Delta t_k \to 0} [E + P(\tau_n)\,\Delta t_n] \cdots [E + P(\tau_1)\,\Delta t_1]. \tag{50}$$

The formula (49') gives a representation of the matricant in the form of a
multiplicative integral

$$\Omega_{t_0}^{t}(P) = \overset{\frown}{\int}_{t_0}^{t} (E + P\,dt), \tag{51}$$

and the formulas (48) and (49) may be used for the approximative
computation of the matricant.
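The approximate formula (49) can be tried on a coefficient matrix whose values at different instants do not commute. In the Python sketch below $P(t) = \begin{pmatrix} 1 & t \\ 0 & 0 \end{pmatrix}$, for which the system (37) can be solved by elementary means and gives $\Omega_0^t = \begin{pmatrix} e^t & e^t - 1 - t \\ 0 & 1 \end{pmatrix}$. The choice of P, the use of midpoints for the $\tau_k$, and all names are assumptions made for the illustration.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def P(t):
    # P(t') P(t'') != P(t'') P(t') for t' != t''
    return [[1.0, t], [0.0, 0.0]]

def product_integral(P, t0, t, n):
    """Finite product of formula (49) with equal subintervals."""
    dt = (t - t0) / n
    M = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(n):
        tau = t0 + (k + 0.5) * dt                  # intermediate point tau_k
        Pk = P(tau)
        factor = [[1.0 + Pk[0][0] * dt, Pk[0][1] * dt],
                  [Pk[1][0] * dt, 1.0 + Pk[1][1] * dt]]
        M = matmul(factor, M)                      # later factors stand on the left
    return M

t = 1.0
approx = product_integral(P, 0.0, t, 4000)
exact = [[math.exp(t), math.exp(t) - 1.0 - t], [0.0, 1.0]]
# Exponential of the ordinary integral of P, which ignores the order of factors:
naive = [[math.exp(t), 0.5 * t * (math.exp(t) - 1.0)], [0.0, 1.0]]
print(approx)
print(exact)
print(naive)
```

The finite product approaches `exact` as the number of factors grows, with an error of the first order in the step, whereas the exponential of the ordinary integral of P gives a different matrix because the values of P are not permutable.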


24 These arguments can be made more precise by an estimate of the terms we have
denoted by (*). For a rigorous deduction of (48') we have to use formula 7. of § 5 in
which the matrix Q(t) must be replaced by a piece-wise constant matrix

$$Q(t) = P(\tau_k) \qquad (t_{k-1} \le t \le t_k;\ k = 1, 2, \ldots, n).$$

25 An analogue to the sum integral for the ordinary integral.

The multiplicative integral was first introduced by Volterra in 1887.
On the basis of this concept Volterra developed an original infinitesimal
calculus for matrix functions (see [63]).²⁶

The whole peculiarity of the multiplicative integral is tied up with the
fact that the various values of the matrix function P(t) in subintervals are
not permutable. In the very special case when all these values are permutable

$$P(t')\,P(t'') = P(t'')\,P(t') \qquad (t', t'' \in (t_0, t)),$$

the multiplicative integral, as is clear from (48') and (51), reduces to the
matrix

$$e^{\int_{t_0}^{t} P(\tau)\,d\tau}.$$

2. We now introduce the multiplicative derivative

$$D_tX = \frac{dX}{dt}\,X^{-1}. \tag{52}$$

The operations $D_t$ and $\overset{\frown}{\int}$ are mutually inverse:
If

$$D_tX = P,$$

then²⁷

$$X = \overset{\frown}{\int}_{t_0}^{t} (E + P\,dt) \cdot C \qquad (C = X(t_0)),$$

and vice versa. The last formula can also be written as follows:²⁸

$$\overset{\frown}{\int}_{t_0}^{t} (E + P\,dt) = X(t)\,X(t_0)^{-1}. \tag{53}$$

We leave it to the reader to verify the following differential and integral
formulas:²⁹

26 The multiplicative integral (in German, Produkt-Integral) was used by Schlesinger
in investigating systems of linear differential equations with analytic coefficients [49]
and [50]; see also [321].

The multiplicative integral (50) exists not only for a function P(t) that is continuous
in the interval of integration, but also under considerably more general conditions
(see [116]).

27 Here the arbitrary constant matrix C is an analogue to the arbitrary additive
constant in the ordinary indefinite integral.

28 An analogue to the formula $\int_{t_0}^{t} P\,dt = X(t) - X(t_0)$, where $\frac{dX}{dt} = P$.

29 These formulas can be deduced immediately from the definitions of the
multiplicative derivative and multiplicative integral (see [63]). However, the integral formulas are
obtained more quickly and simply if the multiplicative integral is regarded as a matricant
and the properties of the matricant that were expounded in the preceding section are used
(see [49]).


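DIFFERENTIAL FORMULAS

I. $D_t(XY) = D_t(X) + X\,D_t(Y)\,X^{-1}$,
   $D_t(XC) = D_t(X)$,
   $D_t(CY) = C\,D_t(Y)\,C^{-1}$ (C is a constant matrix).

II. $D_t(X^{\mathsf T}) = X^{\mathsf T}\,(D_tX)^{\mathsf T}\,(X^{\mathsf T})^{-1}$.

III. $D_t(X^{-1}) = -X^{-1}\,D_t(X)\,X = -\bigl(D_t(X^{\mathsf T})\bigr)^{\mathsf T}$,
     $D_t\bigl((X^{\mathsf T})^{-1}\bigr) = -\bigl(D_t(X)\bigr)^{\mathsf T}$.

INTEGRAL FORMULAS

IV. $\overset{\frown}{\int}_{t_0}^{t} (E + P\,d\tau) = \overset{\frown}{\int}_{t_1}^{t} (E + P\,d\tau)\ \overset{\frown}{\int}_{t_0}^{t_1} (E + P\,d\tau)$.

V. $\overset{\frown}{\int}_{t_0}^{t} (E + P\,d\tau) = \Bigl[\overset{\frown}{\int}_{t}^{t_0} (E + P\,d\tau)\Bigr]^{-1}$.

VI. $\overset{\frown}{\int}_{t_0}^{t} (E + CPC^{-1}\,d\tau) = C\ \overset{\frown}{\int}_{t_0}^{t} (E + P\,d\tau)\ C^{-1}$ (C is a constant matrix).

VII. $\overset{\frown}{\int}_{t_0}^{t} [E + (Q + D_\tau X)\,d\tau] = X(t)\ \overset{\frown}{\int}_{t_0}^{t} (E + X^{-1}QX\,d\tau)\ X(t_0)^{-1}$.³¹

VIII. $\operatorname{mod}\Bigl[\overset{\frown}{\int}_{t_0}^{t} (E + P\,d\tau) - \overset{\frown}{\int}_{t_0}^{t} (E + Q\,d\tau)\Bigr] \le \frac{1}{n}\,e^{nq(t - t_0)}\bigl(e^{nd(t - t_0)} - 1\bigr)I$ ($t > t_0$),
if

$$\operatorname{mod} Q \le qI, \qquad \operatorname{mod}\,(P - Q) \le dI, \qquad I = \| 1 \|$$

(q and d are non-negative numbers; n is the order of P and Q).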
Suppose now that the matrices P and Q depend on the same parameter α,

$$P = P(t, \alpha),\qquad Q = Q(t, \alpha),$$

and that

$$\lim_{\alpha\to\alpha_0} P(t, \alpha) = \lim_{\alpha\to\alpha_0} Q(t, \alpha) = P_0(t),$$

where the limit is approached uniformly with respect to t in the interval $(t_0, t)$ in question. Furthermore, let us assume that for $\alpha \to \alpha_0$ the matrix Q(t, α) is bounded in modulus by qI, where q is a positive constant. Then, setting

$$d(\alpha) = \max_{\substack{1 \le i,k \le n \\ t_0 \le \tau \le t}} \bigl|p_{ik}(\tau, \alpha) - q_{ik}(\tau, \alpha)\bigr|,$$

we have:

$$\lim_{\alpha\to\alpha_0} d(\alpha) = 0.$$


31 The formula VII can be regarded in a certain sense as an analogue to the formula for
integration by parts in ordinary (non-multiplicative) integrals. VII follows from 2.
of § 5.


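Therefore it follows from formula VIII that:

$$\lim_{\alpha\to\alpha_0}\Bigl[\overset{\frown}{\int}_{t_0}^{t}(E + P\,dt) - \overset{\frown}{\int}_{t_0}^{t}(E + Q\,dt)\Bigr] = O.$$

In particular, if Q does not depend on α ($Q(t, \alpha) = P_0(t)$), we obtain:

$$\lim_{\alpha\to\alpha_0}\overset{\frown}{\int}_{t_0}^{t}[E + P(t, \alpha)\,dt] = \overset{\frown}{\int}_{t_0}^{t}[E + P_0(t)\,dt],$$

where

$$P_0(t) = \lim_{\alpha\to\alpha_0} P(t, \alpha).$$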

§ 7. Differential Systems in a Complex Domain. General Properties 


1. We consider a system of differential equations

$$\frac{dx_i}{dz} = \sum_{k=1}^{n} p_{ik}(z)\,x_k \qquad (i = 1, 2, \ldots, n). \tag{54}$$

Here the given functions $p_{ik}(z)$ and the unknown functions $x_i(z)$ (i, k = 1, 2, ..., n) are supposed to be single-valued analytic functions of a complex argument z, regular in a domain G of the complex z-plane.
Introducing the square matrix $P(z) = \|p_{ik}(z)\|_1^n$ and the column matrix $x = (x_1, x_2, \ldots, x_n)$, we can write the system (54), as in the case of a real argument (§ 1), in the form

$$\frac{dx}{dz} = P(z)\,x. \tag{54'}$$


Denoting an integral matrix, i.e., a matrix whose columns are n linearly independent solutions of (54), by X, we can write instead of (54'):

$$\frac{dX}{dz} = P(z)\,X. \tag{55}$$

Jacobi's formula holds also for a complex argument z:

$$|X| = c\,e^{\int_{z_0}^{z}\operatorname{tr}P\,dz}. \tag{56}$$

Here it is assumed that $z_0$ and all the points of the path along which $\int_{z_0}^{z}$ is taken are regular points for the single-valued analytic function $\operatorname{tr}P(z) = p_{11}(z) + p_{22}(z) + \cdots + p_{nn}(z)$.³²


32 Here, and in what follows, the path of integration is taken as a sectionally smooth curve.




2. A peculiar feature of the case of a complex argument is the fact that for a single-valued function P(z) the integral matrix X(z) may well be a many-valued function of z.

As an example, we consider the Cauchy system

$$\frac{dX}{dz} = \frac{U}{z - a}\,X \qquad (U \text{ is a constant matrix}). \tag{57}$$

One of the solutions of this system, as in the case of a real argument (see p. 115), is the integral matrix

$$X = e^{U\ln(z - a)} = (z - a)^{U}. \tag{58}$$

For the domain G we take the whole z-plane except the point z = a. All the points of this domain are regular points of the coefficient matrix

$$P(z) = \frac{U}{z - a}.$$

If $U \ne O$, then z = a is a singular point (a pole of the first order) of the matrix function $P(z) = U/(z - a)$.

An element of the integral matrix (58) after going around the point z = a once in the positive direction returns with a new value which is obtained from the old one by multiplication on the right by the constant matrix

$$V = e^{2\pi i U}.$$
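
The multiplier picked up on one circuit around z = a can be observed numerically. The sketch below forms the finite product of factors $E + U\,\Delta z_k/(\zeta_k - a)$ once around a circle and compares it with $e^{2\pi i U}$ computed from the power series of the exponential. The 2×2 matrix `U`, the centre, the radius, the step count, and all function names are assumptions chosen for illustration.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def identity():
    return [[1 + 0j, 0j], [0j, 1 + 0j]]

def expm(a, terms=80):
    """Exponential of a 2x2 matrix, summed from its power series."""
    result = identity()
    term = identity()
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def circuit_product(u, center, radius, steps=20000):
    """Product of the factors E + U * dz_k / (zeta_k - center), taken once
    around the circle |z - center| = radius in the positive direction."""
    result = identity()
    for k in range(steps):
        z_start = center + radius * cmath.exp(2j * math.pi * k / steps)
        z_end = center + radius * cmath.exp(2j * math.pi * (k + 1) / steps)
        zeta = center + radius * cmath.exp(2j * math.pi * (k + 0.5) / steps)
        c = (z_end - z_start) / (zeta - center)
        factor = [[(1.0 if i == j else 0.0) + c * u[i][j] for j in range(2)]
                  for i in range(2)]
        result = matmul(factor, result)  # later factors stand on the left
    return result

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

U = [[0.5, 1.0], [0.0, 0.25]]
V = expm([[2j * math.pi * x for x in row] for row in U])
circuit = circuit_product(U, center=1.0 + 1.0j, radius=0.5)

print(V)
print(max_abs_diff(circuit, V))
```

For this particular U the product does not depend on the radius or on the centre, because each factor depends only on the ratio $\Delta z_k/(\zeta_k - a)$.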


In the general case of a system (55) we see, by the same reasoning as in the case of a real argument, that two single-valued solutions X and $\tilde X$ are always connected in some part of the domain G by the formula

$$\tilde X = XC,$$

where C is a constant matrix. This formula remains valid under any analytic continuation of the functions X(z) and $\tilde X(z)$ in G.

The proof of the theorem on the existence and (for given initial values) 
uniqueness of the solution of (54) is similar to that of the real case. 

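Let us consider a simply-connected star domain $G_1$ (relative to $z_0$)³³ forming part of G and let the matrix function P(z) be regular³⁴ in $G_1$. We form the series

$$E + \int_{z_0}^{z} P(\zeta)\,d\zeta + \int_{z_0}^{z} P(\zeta)\int_{z_0}^{\zeta} P(\zeta')\,d\zeta'\,d\zeta + \cdots. \tag{59}$$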


33 A domain is called a star domain relative to a point $z_0$ if every segment joining $z_0$ to an arbitrary point z of the domain lies entirely in the given domain.

34 I.e., all the elements $p_{ik}(z)$ (i, k = 1, 2, ..., n) of P(z) are regular functions in $G_1$.




Since $G_1$ is simply-connected, it follows that every integral that occurs in (59) is independent of the path of integration and is a regular function in $G_1$. Since $G_1$ is a star domain relative to $z_0$, we may assume for the purpose of an estimate of the moduli of these integrals that they are all taken along the straight-line segment joining $z_0$ and z.

That the series (59) converges absolutely and uniformly in every closed part of $G_1$ containing $z_0$ follows from the convergence of the majorant

$$1 + lM + \frac{1}{2!}\,l^2M^2 + \frac{1}{3!}\,l^3M^3 + \cdots.$$

Here M is an upper bound for the modulus of P(z) and l an upper bound for the distance of z from $z_0$, and both bounds refer to the closed part of $G_1$ in question.

By differentiating term by term we verify that the sum of the series (59) is a solution of (55). This solution is normalized, because for $z = z_0$ it reduces to the unit matrix E. The single-valued normalized solution of (55) will be called, as in the real case, a matricant and will be denoted by $\Omega_{z_0}^{z}(P)$. Thus we have obtained a representation of the matricant in $G_1$ in the form of a series³⁵

$$\Omega_{z_0}^{z}(P) = E + \int_{z_0}^{z} P(\zeta)\,d\zeta + \int_{z_0}^{z} P(\zeta)\int_{z_0}^{\zeta} P(\zeta')\,d\zeta'\,d\zeta + \cdots. \tag{60}$$


The properties 1.-4. of the matricant that were set up in § 5 automatically 
carry over to the case of a complex argument. 


Any solution of (55) that is regular in G and reduces to the matrix $X_0$ for $z = z_0$ can be represented in the form

$$X = \Omega_{z_0}^{z}(P)\cdot C \qquad (C = X_0). \tag{61}$$

The formula (61) comprises all single-valued solutions that are regular in a neighborhood of $z_0$ ($z_0$ is a regular point of the coefficient matrix P(z)). These solutions when continued analytically in G give all the solutions of (55); i.e., the equation (55) cannot have any solutions for which $z_0$ would be a singular point.

For the analytic continuation of the matricant in G it is convenient to use the multiplicative integral.


35 Our proof for the existence of a normalized solution and its representation in $G_1$ by the series (60) remains valid if instead of the assumption that the domain is a star domain we make a wider assumption, namely, that for every closed part of $G_1$ there exists a positive number l such that every point z of this closed part can be joined to $z_0$ by a path of length not exceeding l.




§ 8. The Multiplicative Integral in a Complex Domain


1. The multiplicative integral along a curve in the complex plane is defined in the following way.

Suppose that L is some path and P(z) a matrix function, continuous on L. We divide the path L into n parts $(z_0, z_1), (z_1, z_2), \ldots, (z_{n-1}, z_n)$; here $z_0$ is the beginning, and $z_n = z$ the end of the path, and $z_1, z_2, \ldots, z_{n-1}$ are intermediate points of division. On the segment $z_{k-1}z_k$ we take an arbitrary point $\zeta_k$ and we use the notation $\Delta z_k = z_k - z_{k-1}$ (k = 1, 2, ..., n). We then define

$$\overset{\frown}{\int}_{L}[E + P(z)\,dz] = \lim_{\Delta z_k \to 0}\,[E + P(\zeta_n)\,\Delta z_n]\cdots[E + P(\zeta_1)\,\Delta z_1].$$


When we compare this definition with that on p. 132, we see that they 
coincide in the special case where L is a segment of the real axis. However, 
even in the general case, where L is located anywhere in the complex plane, 
the new definition may be reduced to the old one by a change of the variable 
of integration. 

If

$$z = z(t)$$

is a parametric equation of the path, where z(t) is a continuous function in the interval $(t_0, t)$ with a piece-wise continuous derivative $\frac{dz}{dt}$, then it is easy to see that

$$\overset{\frown}{\int}_{L}[E + P(z)\,dz] = \overset{\frown}{\int}_{t_0}^{t}\Bigl\{E + P[z(t)]\,\frac{dz}{dt}\,dt\Bigr\}.$$

This formula shows that the multiplicative integral along an arbitrary path exists if the matrix P(z) under the integral sign is continuous along this path.³⁶


2. The multiplicative derivative is defined by the previous formula

$$D_z X = \frac{dX}{dz}\,X^{-1}.$$

Here it is assumed that X(z) is an analytic function.

All the differential formulas (I-III) of the preceding section carry over without change to the case of a complex argument. As regards the integral formulas IV-VI, their outward form has to be modified somewhat:


36 See footnote 26. Even when P(z) is continuous along L, the function $P[z(t)]\,\frac{dz}{dt}$ may only be sectionally continuous. In this case we can split the interval $(t_0, t)$ into partial intervals in each of which the derivative $\frac{dz}{dt}$ is continuous and can interpret the integral from $t_0$ to t as the sum of the integrals along these partial intervals.




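IV′. $\overset{\frown}{\int}_{L' + L''}(E + P\,dz) = \overset{\frown}{\int}_{L''}(E + P\,dz)\;\overset{\frown}{\int}_{L'}(E + P\,dz)$.

V′. $\overset{\frown}{\int}_{-L}(E + P\,dz) = \Bigl[\overset{\frown}{\int}_{L}(E + P\,dz)\Bigr]^{-1}$.

VI′. $\overset{\frown}{\int}_{L}(E + CPC^{-1}\,dz) = C\,\overset{\frown}{\int}_{L}(E + P\,dz)\,C^{-1}$ (C is a constant matrix).

In IV′ we have denoted by L′ + L″ the composite path that is obtained by traversing first L′ and then L″. In V′, −L denotes the path that differs from L only in direction.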

The formula VII now assumes the form

VII′. $\overset{\frown}{\int}_{L}[E + (Q + D_zX)\,dz] = X(z)\,\overset{\frown}{\int}_{L}(E + X^{-1}QX\,dz)\,X(z_0)^{-1}$.

Here $X(z_0)$ and X(z) on the right-hand side denote the values of X(z) at the beginning and at the end of L, respectively.
Formula VIII is now replaced by the formula

VIII′. $\operatorname{mod}\Bigl[\overset{\frown}{\int}_{L}(E + P\,dz) - \overset{\frown}{\int}_{L}(E + Q\,dz)\Bigr] \le \frac{1}{n}\,e^{nql}\bigl(e^{ndl} - 1\bigr)\,I$,

where $\operatorname{mod} Q \le qI$, $\operatorname{mod}(P - Q) \le dI$, $I = \|1\|$, and l is the length of L. VIII′ is easily obtained from VIII if we make a change of variable in the latter and take as the new variable of integration the arc-length s along L (with $\bigl|\frac{dz}{ds}\bigr| = 1$).


3. As in the case of a real argument, there exists a close connection between 
the multiplicative integral and the matricant. 

Suppose that P(z) is a single-valued analytic matrix function, regular in G, and that $G_0$ is a simply-connected domain containing $z_0$ and forming part of G. Then the matricant $\Omega_{z_0}^{z}(P)$ is a regular function of z in $G_0$.

We join the points $z_0$ and z by an arbitrary path L lying entirely in $G_0$ and we choose on L intermediate points $z_1, z_2, \ldots, z_{n-1}$. Then, using the equation

$$\Omega_{z_0}^{z} = \Omega_{z_{n-1}}^{z}\cdots\Omega_{z_1}^{z_2}\,\Omega_{z_0}^{z_1}$$

and proceeding to the limit exactly as in § 6 (p. 132), we obtain:

$$\Omega_{z_0}^{z}(P) = \overset{\frown}{\int}_{L}(E + P\,dz) = \overset{\frown}{\int}_{z_0}^{z}(E + P\,dz). \tag{62}$$


From this formula it is clear that the multiplicative integral depends not on the form of the path, but only on the initial point and the end point if the whole path of integration lies in the simply-connected domain $G_0$ within which the integrand P(z) is regular. In particular, for a closed contour L in $G_0$, we have:

$$\overset{\frown}{\oint}_{L}(E + P\,dz) = E. \tag{63}$$


This formula is an analogue to Cauchy's well-known theorem according to which the ordinary (non-multiplicative) integral along a closed contour is zero if the contour lies in a simply-connected domain within which the integrand is regular.
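
Independence of the path, and the value E along a closed contour, can be illustrated with finite products. The sketch below uses an entire coefficient matrix (so that the whole plane serves as the simply-connected domain) and polygonal paths. The particular `p`, the vertices, the step count, and the function names are assumptions made for the illustration.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def polygon(vertices, steps_per_side=5000):
    """Points of division along the polygonal path through the given vertices."""
    points = [vertices[0]]
    for z_from, z_to in zip(vertices, vertices[1:]):
        for k in range(1, steps_per_side + 1):
            points.append(z_from + (z_to - z_from) * k / steps_per_side)
    return points

def path_product(p, points):
    """Product [E + P(zeta_n) dz_n] ... [E + P(zeta_1) dz_1] along the path,
    with zeta_k taken as the midpoint of the k-th segment."""
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for z_prev, z_next in zip(points, points[1:]):
        dz = z_next - z_prev
        pk = p((z_prev + z_next) / 2)
        factor = [[(1.0 if i == j else 0.0) + pk[i][j] * dz for j in range(2)]
                  for i in range(2)]
        result = matmul(factor, result)
    return result

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

def p(z):
    # An entire matrix function: regular in the whole z-plane.
    return [[0.0, 1.0], [z, 0.0]]

# Two different paths from 0 to 1 + i, and a closed contour through both.
via_right = path_product(p, polygon([0j, 1 + 0j, 1 + 1j]))
via_top = path_product(p, polygon([0j, 1j, 1 + 1j]))
closed = path_product(p, polygon([0j, 1 + 0j, 1 + 1j, 1j, 0j]))

print(max_abs_diff(via_right, via_top))
print(max_abs_diff(closed, [[1, 0], [0, 1]]))
```

Both printed discrepancies are discretization errors and decrease as `steps_per_side` grows.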


4. The representation of the matricant in the form of the multiplicative integral (62) can be used for the analytic continuation of the matricant along an arbitrary path L in G. In this case the formula

$$X = \overset{\frown}{\int}_{z_0}^{z}(E + P\,dz)\,X_0 \tag{64}$$

gives all those branches of the many-valued integral matrix X of the differential equation $\frac{dX}{dz} = PX$ that for $z = z_0$ reduce to $X_0$ on one of the branches. The various branches are obtained by taking account of the various paths joining $z_0$ and z.
By Jacobi's formula (56)

$$|X| = |X_0|\,e^{\int_{z_0}^{z}\operatorname{tr}P\,dz}$$

and, in particular, for $X_0 = E$,

$$\Bigl|\overset{\frown}{\int}_{z_0}^{z}(E + P\,dz)\Bigr| = e^{\int_{z_0}^{z}\operatorname{tr}P\,dz}. \tag{65}$$

From this formula it follows that the multiplicative integral is always 
a non-singular matrix provided only that the path of integration lies entirely 
in a domain in which P(z) is regular. 

If L is an arbitrary closed path in G and G is not a simply-connected domain, then (63) need not hold. Moreover, the value of the integral

$$\overset{\frown}{\oint}_{L}(E + P\,dz)$$

is not determined by specification of the integrand and the closed path of integration L but also depends on the choice of the initial point of integration $z_0$ on L. For let us take on the closed curve L two points $z_0$ and $z_1$ and let us denote the portions of the path from $z_0$ to $z_1$ and from $z_1$ to $z_0$ (in the direction of integration) by $L_1$ and $L_2$, respectively. Then, by the formula IV′,³⁷

$$\overset{\frown}{\oint}_{z_0} = \overset{\frown}{\int}_{L_2}\,\overset{\frown}{\int}_{L_1},\qquad \overset{\frown}{\oint}_{z_1} = \overset{\frown}{\int}_{L_1}\,\overset{\frown}{\int}_{L_2},$$

and therefore

$$\overset{\frown}{\oint}_{z_1} = \overset{\frown}{\int}_{L_1}\;\overset{\frown}{\oint}_{z_0}\;\Bigl[\overset{\frown}{\int}_{L_1}\Bigr]^{-1}. \tag{66}$$

The formula (66) shows that the symbol $\overset{\frown}{\oint}(E + P\,dz)$ determines a certain matrix to within a similarity transformation, i.e., determines only the elementary divisors of that matrix.

We consider an element X(z) of the solution (64) in a neighborhood of $z_0$. Let L be an arbitrary closed path in G beginning and ending at $z_0$. After analytic continuation along L the element X(z) goes over into an element $\tilde X(z)$. But the new element $\tilde X(z)$ satisfies the same differential equation (55), since P(z) is a single-valued function in G. Therefore

$$\tilde X = XV,$$

where V is a non-singular constant matrix. From (64) it follows that

$$\tilde X(z_0) = \overset{\frown}{\oint}_{z_0}(E + P\,dz)\,X_0.$$

Comparing this equation with the preceding one, we find:

$$V = X_0^{-1}\,\overset{\frown}{\oint}_{z_0}(E + P\,dz)\,X_0. \tag{67}$$

In particular, for the matricant $X = \Omega_{z_0}^{z}$ we have $X_0 = E$, and then

$$V = \overset{\frown}{\oint}_{z_0}(E + P\,dz). \tag{68}$$


37 To simplify the notation we have omitted the expression to be integrated, E + P dz, which is the same for all the integrals.




§ 9. Isolated Singular Points 


1. We shall now deal with the behavior of a solution (an integral matrix) 
in a neighborhood of an isolated singular point a. 
Let the matrix function P(z) be regular for the values of z satisfying the inequality

$$0 < |z - a| < R.$$

The set of these values forms a doubly-connected domain G. The matrix function P(z) has in G an expansion in a Laurent series

$$P(z) = \sum_{n=-\infty}^{+\infty} P_n\,(z - a)^n. \tag{69}$$


An element X(z) of the integral matrix, after going once around a in the positive direction along a path L, goes over into an element

$$X^{+}(z) = X(z)\,V,$$

where V is a constant non-singular matrix.
Let U be the constant matrix that is connected with V by the relation

$$V = e^{2\pi i U}. \tag{70}$$

Then the matrix function $(z - a)^U$ after going around a along L goes over into $(z - a)^U V$. Therefore the matrix function

$$F(z) = X(z)\,(z - a)^{-U}, \tag{71}$$

which is analytic in G, goes over into itself (remains unchanged) by analytic continuation along L.³⁸ Therefore the matrix function F(z) is regular in G and can be expanded in G in a Laurent series

$$F(z) = \sum_{n=-\infty}^{+\infty} F_n\,(z - a)^n. \tag{72}$$

From (71) it follows that:

$$X(z) = F(z)\,(z - a)^{U}. \tag{73}$$

Thus every integral matrix X(z) can be represented in the form (73), 
where the single-valued function F(z) and the constant matrix U depend on 


38 Hence it follows that when z traverses any other closed path in G, the function F(z) 
returns to its original value. 




the coefficient matrix P(z). However, the algorithmic determination of U 
and of the coefficients F, in (72) from the coefficients P, in (69) is, in 
general, a complicated task. 

A special case of the problem, where

$$P(z) = \sum_{n=-1}^{\infty} P_n\,(z - a)^n,$$

will be analyzed completely in § 10. In this case, the point a is called a regular singularity of the system (55).
If the expansion (69) has the form

$$P(z) = \sum_{n=-q}^{\infty} P_n\,(z - a)^n \qquad (q > 1;\ P_{-q} \ne O),$$

then a is called an irregular singularity of the type of a pole. Finally, if there is an infinity of non-zero matrix coefficients $P_n$ with negative powers of z − a in (69), then a is called an essential singularity of the given differential system.

From (73) it follows that under an arbitrary single circuit in the positive direction (along some closed path L) an integral matrix X(z) is multiplied on the right by one and the same matrix

$$V = e^{2\pi i U}.$$

If this circuit begins (and ends) at $z_0$, then by (67)

$$V = X(z_0)^{-1}\,\overset{\frown}{\oint}_{z_0}(E + P\,dz)\,X(z_0). \tag{74}$$


If instead of X(z) we consider any other integral matrix $\tilde X(z) = X(z)\,C$ (C is a constant matrix; $|C| \ne 0$), then, as is clear from (74), V is replaced by the similar matrix

$$\tilde V = C^{-1}VC.$$

Thus, the 'integral substitutions' V of the given system form a class of similar matrices.
From (74) it also follows that the integral

$$\overset{\frown}{\oint}_{z_0}(E + P\,dz) \tag{75}$$

is determined by the initial point $z_0$ and does not depend on the form of the curved path.³⁹ If we change the point $z_0$, then the various values of the integral that are so obtained are similar.⁴⁰

These properties of the integral (75) can also be confirmed directly. For let L and L′ be two closed paths in G around z = a with the initial points $z_0$ and $z_0'$ (see Fig. 6).

[Fig. 6]

The doubly-connected domain between L and L′ can be made simply-connected by introducing the cut from $z_0$ to $z_0'$. The integral along the cut will be denoted by

$$T = \overset{\frown}{\int}_{z_0}^{z_0'}(E + P\,dz).$$

Since the multiplicative integral along a closed contour of a simply-connected domain is E, we have

$$T^{-1}\,\Bigl[\overset{\frown}{\oint}_{L'}\Bigr]^{-1}\,T\;\overset{\frown}{\oint}_{L} = E;$$

hence

$$\overset{\frown}{\oint}_{L'} = T\;\overset{\frown}{\oint}_{L}\;T^{-1}.$$

Thus, the integral $\overset{\frown}{\oint}(E + P\,dz)$, like V, is determined to within similarity, and we shall occasionally write (74) in the form

$$V \sim \overset{\frown}{\oint}(E + P\,dz),$$

meaning that the elementary divisors of the matrices on the left-hand and right-hand sides of the equation coincide.


2. As an example, we consider a system with a regular singularity

$$\frac{dX}{dz} = P(z)\,X,$$

where

$$P(z) = \frac{P_{-1}}{z - a} + \sum_{n=0}^{\infty} P_n\,(z - a)^n.$$

Let

$$Q(z) = \frac{P_{-1}}{z - a}.$$

39 Under the condition, of course, that the path of integration goes around a once in the positive direction.
40 This follows from (74), or from (66).




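Using the formula VIII′ of the preceding section, we estimate the modulus of the difference

$$D = \overset{\frown}{\oint}_{z_0}(E + P\,dz) - \overset{\frown}{\oint}_{z_0}(E + Q\,dz), \tag{76}$$

taking as path of integration a circle of radius r (r < R) in the positive direction. Then with

$$\operatorname{mod} P_{-1} \le p_{-1}I,\qquad \operatorname{mod}\sum_{n=0}^{\infty} P_n\,(z - a)^n \le d(r)\,I \quad (|z - a| = r),\qquad I = \|1\|,$$

we set in VIII′:

$$q = \frac{p_{-1}}{r},\qquad d = d(r),\qquad l = 2\pi r$$

and then obtain

$$\operatorname{mod} D \le \frac{1}{n}\,e^{2\pi n p_{-1}}\bigl(e^{2\pi n r d(r)} - 1\bigr)\,I.$$

Hence it is clear that⁴¹

$$\lim_{r\to 0} D = O. \tag{77}$$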

On the other hand, the system

$$\frac{dY}{dz} = QY$$

is a Cauchy system, and in that case we have for an arbitrary choice of the initial point $z_0$ and for every r < R

$$\overset{\frown}{\oint}_{z_0}(E + Q\,dz) = e^{2\pi i P_{-1}}.$$

Therefore it follows from (76) and (77) that:

$$\lim_{r\to 0}\overset{\frown}{\oint}_{z_0}(E + P\,dz) = e^{2\pi i P_{-1}}. \tag{78}$$

But the elementary divisors of the integral $\overset{\frown}{\oint}_{z_0}(E + P\,dz)$ do not depend on $z_0$ and r and coincide with those of the integral substitution V.

From this Volterra in his well-known memoir (see [374]) and his book [63] (pp. 117-120) deduces that the matrices V and $e^{2\pi i P_{-1}}$ are similar, so that the integral substitution V is determined to within similarity by the 'residue' matrix $P_{-1}$.

But this assertion of Volterra is incorrect. 


41 Here we have used the fact that for a suitable choice of d(r)

$$\lim_{r\to 0} d(r) = d_0,$$

where $d_0$ is the greatest of the moduli of the elements of $P_0$.




From (74) and (78) we can only deduce that the characteristic values of the integral substitution V coincide with those of the matrix $e^{2\pi i P_{-1}}$. However, the elementary divisors of these matrices may be distinct. For example, for every $r \ne 0$ the matrix

$$\begin{Vmatrix} \alpha & r \\ 0 & \alpha \end{Vmatrix}$$

has one elementary divisor $(\lambda - \alpha)^2$, but the limit of the matrix for $r \to 0$, i.e., the matrix $\begin{Vmatrix} \alpha & 0 \\ 0 & \alpha \end{Vmatrix}$, has two elementary divisors $\lambda - \alpha$, $\lambda - \alpha$.

Thus, Volterra's assertion does not follow from (74) and (78). It is not even true in general, as the following example shows.

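Let

$$P(z) = \begin{Vmatrix} 0 & 1 \\ 0 & -\frac{1}{z} \end{Vmatrix}.$$

The corresponding system of differential equations has the form:

$$\frac{dx_1}{dz} = x_2,\qquad \frac{dx_2}{dz} = -\frac{x_2}{z}.$$

Integrating the system we find:

$$x_1 = c\ln z + d,\qquad x_2 = \frac{c}{z}.$$

The integral matrix

$$X(z) = \begin{Vmatrix} \ln z & 1 \\ z^{-1} & 0 \end{Vmatrix},$$

when the singular point z = 0 is encircled once in the positive direction, is multiplied on the right by the matrix

$$V = \begin{Vmatrix} 1 & 0 \\ 2\pi i & 1 \end{Vmatrix}.$$

This matrix has one elementary divisor $(\lambda - 1)^2$. At the same time the matrix

$$e^{2\pi i P_{-1}} = e^{2\pi i \begin{Vmatrix} 0 & 0 \\ 0 & -1 \end{Vmatrix}} = E$$

has two elementary divisors $\lambda - 1$, $\lambda - 1$.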
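
The counter-example can be examined numerically. The sketch below forms the finite product of factors $E + P(\zeta_k)\,\Delta z_k$ once around the circle |z| = r, starting at $z_0 = r$, for the coefficient matrix with rows (0, 1) and (0, −1/z). A direct computation from the integral matrix and the substitution V given in this example suggests that the circuit integral from $z_0 = r$ equals the matrix with rows (1, 2πir) and (0, 1); this closed form, the function name `circuit_integral`, the radii, and the step count are assumptions of the sketch rather than statements taken from the text.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def circuit_integral(radius, steps=20000):
    """Finite-product approximation of the multiplicative integral of
    E + P dz once around |z| = radius (positive direction, start z0 = radius),
    for P(z) = [[0, 1], [0, -1/z]]."""
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for k in range(steps):
        z_start = radius * cmath.exp(2j * math.pi * k / steps)
        z_end = radius * cmath.exp(2j * math.pi * (k + 1) / steps)
        zeta = radius * cmath.exp(2j * math.pi * (k + 0.5) / steps)
        dz = z_end - z_start
        factor = [[1 + 0j, dz], [0j, 1 - dz / zeta]]  # E + P(zeta) dz
        result = matmul(factor, result)
    return result

half = circuit_integral(0.5)
small = circuit_integral(0.01)

print(half)   # compare with [[1, 2*pi*i*0.5], [0, 1]]
print(small)  # approaches E as the radius shrinks
```

For every radius the off-diagonal element stays different from zero, so the circuit integral has the single elementary divisor $(\lambda - 1)^2$, while its limit for $r \to 0$ is E. This is the same phenomenon as in the 2×2 example with elements α and r given above.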


3. We now consider the case where the matrix P(z) has a finite number of negative powers of z − a (a is a regular or irregular singularity of the type of a pole):

$$P(z) = \frac{P_{-q}}{(z - a)^q} + \cdots + \frac{P_{-1}}{z - a} + \sum_{n=0}^{\infty} P_n\,(z - a)^n \qquad (q \ge 1;\ P_{-q} \ne O).$$

We transform the given system

$$\frac{dX}{dz} = PX \tag{79}$$

by setting

$$X = A(z)\,Y, \tag{80}$$

where A(z) is a matrix function that is regular at z = a and assumes there the value E:

$$A(z) = E + A_1(z - a) + A_2(z - a)^2 + \cdots;$$

the power series on the right-hand side converges for $|z - a| < r_1$.

The well-known American mathematician G. D. Birkhoff published a theorem in 1913 (see [117]) according to which the transformation (80) can always be chosen such that the coefficient matrix of the transformed system

$$\frac{dY}{dz} = P^*(z)\,Y \tag{79'}$$

contains only negative powers of z − a:

$$P^*(z) = \frac{P^*_{-q}}{(z - a)^q} + \cdots + \frac{P^*_{-1}}{z - a}.$$

Birkhoff's theorem with its complete proof is reproduced in the book Ordinary Differential Equations, by E. L. Ince.⁴² Moreover, on the basis of these 'canonical' systems (79') he investigates the behavior of the solution of an arbitrary system in the neighborhood of a singular point.

Nevertheless, Birkhoff's proof contains an error, and the theorem is not true. As a counter-example we can take the same example by which we have above refuted Volterra's claim.⁴³

In this example q = 1, a = 0 and

$$P_{-1} = \begin{Vmatrix} 0 & 0 \\ 0 & -1 \end{Vmatrix},\qquad P_0 = \begin{Vmatrix} 0 & 1 \\ 0 & 0 \end{Vmatrix},\qquad P_m = O \ \text{ for } m = 1, 2, \ldots.$$


42 See [20], pp. 632-41. Birkhoff and Ince formulate the theorem for the singular point z = ∞. This is no restriction, because every singular point z = a can be carried by the transformation z′ = 1/(z − a) into z′ = ∞.

43 In the case q = 1 the erroneous statement of Birkhoff coincides in essence with Volterra's mistake (see p. 145).




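Applying Birkhoff's theorem and substituting in (79) the product AY for X, we obtain after replacing $\frac{dY}{dz}$ by $\frac{P^*_{-1}}{z}\,Y$ and cancelling Y:

$$A\,\frac{P^*_{-1}}{z} + \frac{dA}{dz} = PA.$$

Equating the coefficients of 1/z and of the free terms we find:

$$P^*_{-1} = P_{-1},\qquad A_1P_{-1} - P_{-1}A_1 + A_1 = P_0.$$

Setting

$$A_1 = \begin{Vmatrix} a & b \\ c & d \end{Vmatrix},$$

we obtain:

$$\begin{Vmatrix} 0 & -b \\ 0 & -d \end{Vmatrix} - \begin{Vmatrix} 0 & 0 \\ -c & -d \end{Vmatrix} + \begin{Vmatrix} a & b \\ c & d \end{Vmatrix} = \begin{Vmatrix} 0 & 1 \\ 0 & 0 \end{Vmatrix}.$$

This is a contradictory equation: the element in the first row and second column is 0 on the left-hand side and 1 on the right-hand side.
In the following section we shall examine, for the case of a regular singularity, what canonical form the system (79) can be transformed into by means of a transformation (80).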

§ 10. Regular Singularities


In studying the behavior of a solution in a neighborhood of a singular point we can assume without loss of generality that the singular point is z = 0.⁴⁴


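1. Let the given system be

$$\frac{dX}{dz} = P(z)\,X, \tag{81}$$

where

$$P(z) = \frac{P_{-1}}{z} + \sum_{m=0}^{\infty} P_m z^m \tag{82}$$

and the series $\sum_{m=0}^{\infty} P_m z^m$ converges in the circle |z| < r.
We set

$$X = A(z)\,Y, \tag{83}$$

where

$$A(z) = E + A_1z + A_2z^2 + \cdots. \tag{84}$$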


44 By the transformation z′ = z − a or z′ = 1/z every finite point z = a or z = ∞ can be carried into z′ = 0.




Leaving aside for the time being the problem of convergence of the series (84), let us try to determine the matrix coefficients $A_m$ such that the transformed system

$$\frac{dY}{dz} = P^*(z)\,Y, \tag{85}$$

where

$$P^*(z) = \frac{P^*_{-1}}{z} + \sum_{m=0}^{\infty} P^*_m z^m, \tag{86}$$

is of the simplest possible ('canonical') form.⁴⁵
When we substitute the product AY for X in (81) and use (85), we obtain:

$$A(z)\,P^*(z)\,Y + \frac{dA}{dz}\,Y = P(z)\,A(z)\,Y.$$

Multiplying both sides of the equation by $Y^{-1}$ on the right we find:

$$P(z)\,A(z) - A(z)\,P^*(z) = \frac{dA}{dz}.$$


When we replace here P(z), A(z), and P*(z) by the series (82), (84), and (86) and equate the coefficients of equal powers of z on the two sides, we obtain an infinite system of matrix equations for the unknown coefficients $A_1, A_2, \ldots$:⁴⁶

1. $P_{-1} = P^*_{-1}$,
2. $P_{-1}A_1 - A_1(P_{-1} + E) + P_0 = P^*_0$,
3. $P_{-1}A_2 - A_2(P_{-1} + 2E) + P_0A_1 - A_1P^*_0 + P_1 = P^*_1$,
. . . . . . . . . . . . . . . . . . . .
(m + 2). $P_{-1}A_{m+1} - A_{m+1}[P_{-1} + (m + 1)E] + P_0A_m - A_mP^*_0 + P_1A_{m-1} - A_{m-1}P^*_1 + \cdots + P_m = P^*_m$,
. . . . . . . . . . . . . . . . . . . .

(87)


2. We consider several cases separately.

1. The matrix $P_{-1}$ does not have distinct characteristic values that differ from each other by an integer.


45 We shall aim at having only a finite number (and indeed the smallest possible number) of non-zero coefficients $P^*_m$ in (86).

46 In all the equations beginning with the second we replace $P^*_{-1}$ by $P_{-1}$ in accordance with the first equation.




In this case the matrices $P_{-1}$ and $P_{-1} + kE$ do not have characteristic values in common for any k = 1, 2, 3, ..., and therefore (see Vol. I, Chapter VIII, § 3)⁴⁷ the matrix equation

$$P_{-1}U - U(P_{-1} + kE) = T$$

has one and only one solution for an arbitrary right-hand side T.
We shall denote this solution by

$$\Phi_k(P_{-1}, T).$$

We can therefore set all the matrices $P^*_m$ (m = 0, 1, 2, ...) in (87) equal to zero and determine $A_1, A_2, \ldots$ successively by means of the equations

$$A_1 = \Phi_1(P_{-1}, -P_0),\qquad A_2 = \Phi_2(P_{-1}, -P_1 - P_0A_1),\qquad \ldots.$$

The transformed system is then a Cauchy system

$$\frac{dY}{dz} = \frac{P_{-1}}{z}\,Y$$

and so the solution X of the original system (81) is of the form⁴⁸

$$X = A(z)\,z^{P_{-1}}. \tag{88}$$
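
The successive determination of the coefficients $A_1, A_2, \ldots$ in this first case can be sketched in a few lines of Python. The sketch assumes, for simplicity, that $P_{-1}$ is already diagonal with characteristic values `lams`, no two of which differ by a positive integer; the matrix equation then splits into scalar equations and can be solved element by element. The function names `solve_phi` and `series_coefficients` and the sample data are illustrative assumptions.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def solve_phi(lams, k, t):
    """Solve P U - U (P + k E) = T for diagonal P = diag(lams).

    Element by element: (lam_i - lam_j - k) * u_ij = t_ij.  The divisor is
    non-zero when no two characteristic values differ by the integer k.
    """
    n = len(lams)
    return [[t[i][j] / (lams[i] - lams[j] - k) for j in range(n)]
            for i in range(n)]

def series_coefficients(lams, p_coeffs, count):
    """Return [A_1, ..., A_count] with every transformed coefficient P*_m = O.

    p_coeffs[m] is P_m for m = 0, 1, ...; coefficients not listed are zero.
    """
    n = len(lams)
    a = [[[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]]  # A_0 = E
    for m in range(count):
        # Right-hand side: -(P_0 A_m + P_1 A_{m-1} + ... + P_m A_0).
        rhs = [[0.0] * n for _ in range(n)]
        for j in range(min(m, len(p_coeffs) - 1) + 1):
            prod = matmul(p_coeffs[j], a[m - j])
            rhs = [[rhs[r][c] - prod[r][c] for c in range(n)] for r in range(n)]
        a.append(solve_phi(lams, m + 1, rhs))
    return a[1:]

# Example: P(z) = diag(0.5, 0.2) / z + [[0, 1], [0, 0]].
a1, a2 = series_coefficients([0.5, 0.2], [[[0.0, 1.0], [0.0, 0.0]]], 2)
print(a1, a2)

# Scalar check: P(z) = 0.3 / z + 2 has the solution z**0.3 * exp(2 z),
# so the coefficients should be 2**m / m!.
scalar = series_coefficients([0.3], [[[2.0]]], 3)
print(scalar)
```

In the first example the series for A(z) terminates after the linear term, so that $X = (E + A_1z)\,z^{P_{-1}}$ can be verified by direct substitution into (81).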


2. Among the distinct characteristic values of $P_{-1}$ there are some whose difference is an integer; furthermore, the matrix $P_{-1}$ is of simple structure.

We denote the characteristic values of $P_{-1}$ by $\lambda_1, \lambda_2, \ldots, \lambda_n$ and order them in such a way that the inequalities

$$\operatorname{Re}(\lambda_1) \ge \operatorname{Re}(\lambda_2) \ge \cdots \ge \operatorname{Re}(\lambda_n) \tag{89}$$

hold.


47 However, we can also prove this without referring to Chapter VIII. The proposition in which we are interested is equivalent to the statement that the matrix equation

$$P_{-1}U = U(P_{-1} + kE) \tag{*}$$

has only the solution U = O. Since the matrices $P_{-1}$ and $P_{-1} + kE$ have no characteristic values in common, there exists a polynomial f(λ) for which

$$f(P_{-1}) = O,\qquad f(P_{-1} + kE) = E.$$

But from (*) it follows that

$$f(P_{-1})\,U = U f(P_{-1} + kE).$$

Hence U = O.

48 The formula (88) defines one integral matrix of the system (81). Every integral matrix is obtained from (88) by multiplication on the right by an arbitrary constant non-singular matrix C.




Without loss of generality we can replace $P_{-1}$ by a similar matrix. This follows from the fact that when we multiply both sides of (81) on the left by a non-singular matrix T and on the right by $T^{-1}$, we in fact replace all the $P_m$ by $TP_mT^{-1}$ (m = −1, 0, 1, 2, ...); moreover, X is replaced by $TXT^{-1}$. Therefore we may assume in this case that $P_{-1}$ is a diagonal matrix:

$$P_{-1} = \|\lambda_i\delta_{ik}\|_1^n. \tag{90}$$

We introduce a notation for the elements of $P_m$, $P^*_m$, and $A_m$:

$$P_m = \|p_{ik}^{(m)}\|_1^n,\qquad P^*_m = \|p_{ik}^{(m)*}\|_1^n,\qquad A_m = \|x_{ik}^{(m)}\|_1^n. \tag{91}$$


In order to determine $A_1$ we use the second equation in (87). This matrix equation can be replaced by the scalar equations

$$(\lambda_i - \lambda_k - 1)\,x_{ik}^{(1)} + p_{ik}^{(0)} = p_{ik}^{(0)*} \qquad (i, k = 1, 2, \ldots, n). \tag{92}$$

If none of the differences $\lambda_i - \lambda_k$ is 1, we can set $P^*_0 = O$. We then have from the second equation of (87) that $A_1 = \Phi_1(P_{-1}, -P_0)$.⁴⁹
In that case the elements of $A_1$ are uniquely determined from (92):

$$x_{ik}^{(1)} = -\frac{p_{ik}^{(0)}}{\lambda_i - \lambda_k - 1} \qquad (i, k = 1, 2, \ldots, n). \tag{93}$$

But if for some⁵⁰ i, k

$$\lambda_i - \lambda_k = 1,$$

then the corresponding $p_{ik}^{(0)*}$ is determined from (92):

$$p_{ik}^{(0)*} = p_{ik}^{(0)},$$

and the corresponding $x_{ik}^{(1)}$ can be chosen quite arbitrarily.
For those i and k for which $\lambda_i - \lambda_k \ne 1$ we set:

$$p_{ik}^{(0)*} = 0,$$

and find the corresponding $x_{ik}^{(1)}$ from (93).
Having determined $A_1$, we next determine $A_2$ from the third equation of (87). We replace this matrix equation by a system of n² scalar equations:

$$(\lambda_i - \lambda_k - 2)\,x_{ik}^{(2)} = p_{ik}^{(1)*} - p_{ik}^{(1)} - (P_0A_1 - A_1P^*_0)_{ik} \qquad (i, k = 1, 2, \ldots, n). \tag{94}$$

Here we proceed exactly as in the determination of $A_1$.


49 We use the notation introduced in dealing with the case 1.
50 By (89) this is only possible for i < k.




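If $\lambda_i - \lambda_k \ne 2$, then we set:

$$p_{ik}^{(1)*} = 0,$$

and find from (94):

$$x_{ik}^{(2)} = -\frac{1}{\lambda_i - \lambda_k - 2}\,\bigl[p_{ik}^{(1)} + (P_0A_1 - A_1P^*_0)_{ik}\bigr].$$

But if $\lambda_i - \lambda_k = 2$, then it follows from (94) for these i and k that:

$$p_{ik}^{(1)*} = p_{ik}^{(1)} + (P_0A_1 - A_1P^*_0)_{ik}.$$

In this case $x_{ik}^{(2)}$ is chosen arbitrarily.

Continuing this process we determine all the matrices $P^*_{-1}, P^*_0, P^*_1, \ldots$ and $A_1, A_2, \ldots$ in succession.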
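
The element-by-element procedure of this second case can be sketched as follows, again assuming that $P_{-1}$ is the diagonal matrix with elements `lams`. Whenever a difference of characteristic values equals the current integer m + 1, the corresponding element is placed in $P^*_m$ and the arbitrary element of $A_{m+1}$ is set to zero; that choice, the tolerance `tol` used to recognize integer differences in floating point, and the function name `canonical_reduction` are assumptions of the sketch.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def canonical_reduction(lams, p_coeffs, count, tol=1e-9):
    """Return ([A_1, ..., A_count], [P*_0, ..., P*_{count-1}]) for diagonal
    P_{-1} = diag(lams); p_coeffs[m] is P_m, unlisted coefficients are zero."""
    n = len(lams)
    a = [[[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]]  # A_0 = E
    pstar = []
    for m in range(count):
        # G = P_0 A_m + ... + P_m A_0 - (A_m P*_0 + ... + A_1 P*_{m-1}),
        # so that element-wise (lam_i - lam_k - (m+1)) x_ik + g_ik = p*_ik.
        g = [[0.0] * n for _ in range(n)]
        for j in range(min(m, len(p_coeffs) - 1) + 1):
            prod = matmul(p_coeffs[j], a[m - j])
            g = [[g[r][c] + prod[r][c] for c in range(n)] for r in range(n)]
        for j in range(m):
            prod = matmul(a[m - j], pstar[j])
            g = [[g[r][c] - prod[r][c] for c in range(n)] for r in range(n)]
        a_next = [[0.0] * n for _ in range(n)]
        p_next = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for k in range(n):
                divisor = lams[i] - lams[k] - (m + 1)
                if abs(divisor) < tol:
                    p_next[i][k] = g[i][k]  # this element survives in P*_m
                    # x_ik is arbitrary here; the sketch keeps it equal to 0
                else:
                    a_next[i][k] = -g[i][k] / divisor
        a.append(a_next)
        pstar.append(p_next)
    return a[1:], pstar

# The example used against Volterra and Birkhoff: P_{-1} = diag(0, -1),
# P_0 = [[0, 1], [0, 0]], P_m = O for m >= 1.
a_list, pstar_list = canonical_reduction([0.0, -1.0], [[[0.0, 1.0], [0.0, 0.0]]], 2)
print(a_list, pstar_list)

# A case where the difference 1 occurs but the corresponding element vanishes.
b_list, qstar_list = canonical_reduction([1.0, 0.0], [[[0.0, 0.0], [1.0, 0.0]]], 2)
print(b_list, qstar_list)
```

For the first example the procedure leaves the system unchanged: $A_1 = A_2 = O$ and $P^*_0 = P_0 \ne O$, in agreement with the observation that this system cannot be brought to a form containing only the term in 1/z.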


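Furthermore, only a finite number of the matrices $P^*_m$ is different from zero and, as is easy to see, P*(z) is of the form⁵¹

$$P^*(z) = \begin{Vmatrix} \frac{\lambda_1}{z} & a_{12}z^{\lambda_1 - \lambda_2 - 1} & \cdots & a_{1n}z^{\lambda_1 - \lambda_n - 1} \\ 0 & \frac{\lambda_2}{z} & \cdots & a_{2n}z^{\lambda_2 - \lambda_n - 1} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{\lambda_n}{z} \end{Vmatrix}, \tag{95}$$

where $a_{ik} = 0$ when $\lambda_i - \lambda_k$ is not a positive integer, and $a_{ik} = p_{ik}^{(\lambda_i - \lambda_k - 1)*}$ when $\lambda_i - \lambda_k$ is a positive integer.
We denote by $m_i$ the integral part of the numbers $\operatorname{Re}\lambda_i$:⁵²

$$m_i = [\operatorname{Re}(\lambda_i)] \qquad (i = 1, 2, \ldots, n). \tag{96}$$

Then, by (89),

$$m_1 \ge m_2 \ge \cdots \ge m_n.$$

If $\lambda_i - \lambda_k$ is an integer, then

$$\lambda_i - \lambda_k = m_i - m_k.$$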


51 $P^*_m$ (m ≥ 0) can be different from zero only when there exist characteristic values $\lambda_i$ and $\lambda_k$ of $P_{-1}$ such that $\lambda_i - \lambda_k - 1 = m$ (and, by (89), i < k). For a given m there corresponds to each such equation an element $p_{ik}^{(m)*} = a_{ik}$ of the matrix $P^*_m$; this element may be different from zero. All the remaining elements of $P^*_m$ are zero.

52 I.e., $m_i$ is the largest integer not exceeding $\operatorname{Re}\lambda_i$ (i = 1, 2, ..., n).




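Therefore in the expression (95) for the canonical matrix P*(z) we can replace all the differences $\lambda_i - \lambda_k$ by $m_i - m_k$. Furthermore, we set:

$$\tilde\lambda_i = \lambda_i - m_i \qquad (i = 1, 2, \ldots, n), \tag{96'}$$

$$M = \|m_i\delta_{ik}\|_1^n,\qquad U = \begin{Vmatrix} \tilde\lambda_1 & a_{12} & \cdots & a_{1n} \\ 0 & \tilde\lambda_2 & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \tilde\lambda_n \end{Vmatrix}. \tag{97}$$

Then it follows from (95) (see formula I on p. 134):

$$P^*(z) = z^M\,\frac{U}{z}\,z^{-M} + \frac{M}{z} = D_z(z^M z^U).$$

Hence $Y = z^M z^U$ is a solution of (85) and

$$X = A(z)\,z^M z^U \tag{98}$$

is a solution of (81).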

3. The general case. As we have explained above, we may replace $P_{-1}$ without loss of generality by an arbitrary similar matrix. We shall assume that $P_{-1}$ has the Jordan normal form⁵⁴

$$P_{-1} = \{\lambda_1E_1 + H_1,\ \lambda_2E_2 + H_2,\ \ldots,\ \lambda_uE_u + H_u\}, \tag{99}$$

with

$$\operatorname{Re}(\lambda_1) \ge \operatorname{Re}(\lambda_2) \ge \cdots \ge \operatorname{Re}(\lambda_u). \tag{100}$$

Here E denotes the unit matrix and H the matrix in which the elements of the first superdiagonal are 1 and all the remaining elements zero. The orders of the matrices $E_i$ and $H_i$ in distinct diagonal blocks are, in general, different; their orders coincide with the degrees of the corresponding elementary divisors of $P_{-1}$.⁵⁵

In accordance with the representation (99) of $P_{-1}$ we split all the matrices $P_m$, $P^*_m$, $A_m$ into blocks:


53 The special form of the matrices (97) corresponds to the canonical form of $P_{-1}$. If $P_{-1}$ does not have the canonical form, then the matrices M and U in (98) are similar to the matrices (97).

54 See Vol. I, Chapter VI, § 6.

55 To simplify the notation, the index that indicates the order of the matrices is omitted from $E_i$ and $H_i$.


$$P_m = \bigl(P_{ik}^{(m)}\bigr)_1^u,\qquad P^*_m = \bigl(P_{ik}^{(m)*}\bigr)_1^u,\qquad A_m = \bigl(X_{ik}^{(m)}\bigr)_1^u.$$


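Then the second of the equations (87) may be replaced by a system of equations

$$(\lambda_iE_i + H_i)\,X_{ik}^{(1)} - X_{ik}^{(1)}\,[(\lambda_k + 1)E_k + H_k] + P_{ik}^{(0)} = P_{ik}^{(0)*}, \tag{101}$$

which can also be written as follows:

$$(\lambda_i - \lambda_k - 1)\,X_{ik}^{(1)} + H_iX_{ik}^{(1)} - X_{ik}^{(1)}H_k + P_{ik}^{(0)} = P_{ik}^{(0)*} \qquad (i, k = 1, 2, \ldots, u). \tag{102}$$

Suppose that⁵⁶

$$X_{ik}^{(1)} = \|x_{st}\|,\qquad P_{ik}^{(0)} = \|p_{st}^{(0)}\|,\qquad P_{ik}^{(0)*} = \|p_{st}^{(0)*}\| \qquad (s = 1, 2, \ldots, v;\ t = 1, 2, \ldots, w).$$

Then the matrix equation (102) (for fixed i and k) can be replaced by a system of scalar equations of the form⁵⁷

$$(\lambda_i - \lambda_k - 1)\,x_{st} + x_{s+1,t} - x_{s,t-1} + p_{st}^{(0)} = p_{st}^{(0)*} \tag{103}$$
$$(s = 1, 2, \ldots, v;\ t = 1, 2, \ldots, w;\ x_{v+1,t} = x_{s,0} = 0),$$

where v and w are the orders of the matrices $\lambda_iE_i + H_i$ and $\lambda_kE_k + H_k$ in (99).
If $\lambda_i - \lambda_k \ne 1$, then in (103) we can set all the $p_{st}^{(0)*}$ equal to zero and determine all the $x_{st}$ uniquely from the recurrence relations (103). This means that in the matrix equations (102) we set

$$P_{ik}^{(0)*} = O$$

and determine $X_{ik}^{(1)}$ uniquely.
If $\lambda_i - \lambda_k = 1$, then the relations (103) assume the form

$$x_{s+1,t} - x_{s,t-1} + p_{st}^{(0)} = p_{st}^{(0)*} \tag{104}$$
$$(x_{v+1,t} = x_{s,0} = 0;\ s = 1, 2, \ldots, v;\ t = 1, 2, \ldots, w).$$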


56 To simplify the notation, we omit the indices i, k in the elements of the matrices $X_{ik}^{(1)}$, $P_{ik}^{(0)}$, $P_{ik}^{(0)*}$.

57 The reader should bear in mind the properties of the matrix H that were developed on pp. 13-15 of Vol. I.




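It is not difficult to show that the elements $x_{st}$ of $X_{ik}^{(1)}$ can be determined from (104) so that the matrix $P_{ik}^{(0)*}$ has, depending on its dimensions (v × w), one of the forms

$$\begin{Vmatrix} a_0 & 0 & \cdots & 0 \\ a_1 & a_0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_{v-1} & a_{v-2} & \cdots & a_0 \end{Vmatrix}\ (v = w),\qquad \begin{Vmatrix} a_0 & 0 & \cdots & 0 & 0 & \cdots & 0 \\ a_1 & a_0 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\ a_{v-1} & a_{v-2} & \cdots & a_0 & 0 & \cdots & 0 \end{Vmatrix}\ (v < w),$$

$$\begin{Vmatrix} 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \\ a_0 & 0 & \cdots & 0 \\ a_1 & a_0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_{w-1} & a_{w-2} & \cdots & a_0 \end{Vmatrix}\ (v > w). \tag{105}$$

We shall say of the matrices (105) that they have the regular lower triangular form.⁵⁸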

From the third of the equations (87) we can determine $A_2$. This equation can be replaced by the system

$$(\lambda_i - \lambda_k - 2)\,X_{ik}^{(2)} + H_i X_{ik}^{(2)} - X_{ik}^{(2)} H_k + \left[P_0 A_1 - A_1 P_0^*\right]_{ik} + P_{ik}^{(1)} = P_{ik}^{(1)*} \tag{106}$$
$$(i, k = 1, 2, \dots, u).$$

In the same way that we determined $A_1$, we determine $X_{ik}^{(2)}$ uniquely with $P_{ik}^{(1)*} = O$ from (106) provided $\lambda_i - \lambda_k \neq 2$. But if $\lambda_i - \lambda_k = 2$, then $X_{ik}^{(2)}$ can be determined so that $P_{ik}^{(1)*}$ is of regular lower triangular form.
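The recurrence (103) for the case $\lambda_i - \lambda_k \neq 1$ can be sketched in code. This is only an illustrative sketch under stated assumptions: $H$ is taken to be the matrix with ones on the first superdiagonal (so that $(HX)_{st} = x_{s+1,t}$ and $(XH)_{st} = x_{s,t-1}$), all $p_{st}^{(0)*}$ are set to zero, and the function names `solve_block` and `residual` are hypothetical, not taken from the text.

```python
from fractions import Fraction

def solve_block(lam_i, lam_k, P0):
    """Solve (lam_i - lam_k - 1) X + H X - X H + P0 = O for the v x w block X.

    Scalar form (103) with all p* = 0; requires lam_i - lam_k != 1.
    """
    c = Fraction(lam_i) - Fraction(lam_k) - 1
    if c == 0:
        raise ValueError("lam_i - lam_k = 1: relations (104) apply instead")
    v, w = len(P0), len(P0[0])
    # x[s][t] for s = 1..v+1, t = 0..w; the border values
    # x[v+1][t] = x[s][0] = 0 are built into the padding.
    x = [[Fraction(0)] * (w + 1) for _ in range(v + 2)]
    for s in range(v, 0, -1):        # bottom row first: x[s+1][t] is known
        for t in range(1, w + 1):    # left column first: x[s][t-1] is known
            x[s][t] = (x[s][t - 1] - x[s + 1][t]
                       - Fraction(P0[s - 1][t - 1])) / c
    return [row[1:] for row in x[1:v + 1]]

def residual(lam_i, lam_k, X, P0):
    """Left-hand side of (102) with P* = O; all zeros when X is a solution."""
    c = Fraction(lam_i) - Fraction(lam_k) - 1
    v, w = len(X), len(X[0])

    def get(s, t):
        return X[s][t] if 0 <= s < v and 0 <= t < w else Fraction(0)

    return [[c * X[s][t] + get(s + 1, t) - get(s, t - 1) + P0[s][t]
             for t in range(w)] for s in range(v)]

X = solve_block(0, 0, [[1, 2], [3, 4]])
print(X, residual(0, 0, X, [[1, 2], [3, 4]]))
```

The order of the two loops mirrors the structure of (103): each $x_{st}$ depends only on the entry below it and the entry to its left, so the unknowns are filled from the bottom-left corner outward.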

58 Regular upper triangular matrices are defined similarly. The elements of $X_{ik}^{(1)}$ are not all uniquely determined from (104); there is a certain degree of arbitrariness in the choice of the elements $x_{st}$. This is immediately clear from (102): for $\lambda_i - \lambda_k = 1$ we may add to $X_{ik}^{(1)}$ an arbitrary matrix permutable with $H$, i.e., an arbitrary regular upper triangular matrix.




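Continuing this process, we determine all the coefficient matrices $A_1, A_2, \dots$ and $P_0^*, P_1^*, P_2^*, \dots$ in succession. Only a finite number of the coefficients $P_m^*$ is different from zero, and the matrix $P^*(z)$ has the following block form:⁵⁹

$$P^*(z) = \begin{Vmatrix} \dfrac{\lambda_1 E_1 + H_1}{z} & B_{12}\,z^{\lambda_1 - \lambda_2 - 1} & \cdots & B_{1u}\,z^{\lambda_1 - \lambda_u - 1} \\ O & \dfrac{\lambda_2 E_2 + H_2}{z} & \cdots & B_{2u}\,z^{\lambda_2 - \lambda_u - 1} \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & \dfrac{\lambda_u E_u + H_u}{z} \end{Vmatrix}, \tag{107}$$

where

$$B_{ik} = \begin{cases} O & \text{if } \lambda_i - \lambda_k \text{ is not a positive integer}, \\ P_{ik}^{(\lambda_i - \lambda_k - 1)*} & \text{if } \lambda_i - \lambda_k \text{ is a positive integer}. \end{cases}$$

All the matrices $B_{ik}$ ($i, k = 1, 2, \dots, u$; $i < k$) are of regular lower triangular form.

As in the preceding case, we denote by $m_i$ the integral part of $\operatorname{Re} \lambda_i$,

$$m_i = [\operatorname{Re}(\lambda_i)] \qquad (i = 1, 2, \dots, u), \tag{108}$$

and we set

$$\lambda_i = m_i + \tilde{\lambda}_i \qquad (i = 1, 2, \dots, u). \tag{108'}$$

Then in the expression (107) for $P^*(z)$ we may again replace the difference $\lambda_i - \lambda_k$ everywhere by $m_i - m_k$. If we introduce the diagonal matrix $M$ with integer elements and the upper triangular matrix $U$ by means of the equations⁶⁰

$$M = \left(m_i E_i \delta_{ik}\right)_1^u, \qquad U = \begin{Vmatrix} \tilde{\lambda}_1 E_1 + H_1 & B_{12} & \cdots & B_{1u} \\ O & \tilde{\lambda}_2 E_2 + H_2 & \cdots & B_{2u} \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & \tilde{\lambda}_u E_u + H_u \end{Vmatrix}, \tag{109}$$

then we easily obtain, starting from (107), the following representation of $P^*(z)$:

$$P^*(z) = z^M\,\frac{U}{z}\,z^{-M} + \frac{M}{z} = D_z\left(z^M z^U\right).$$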
59 The dimensions of the square matrices $E_i$, $H_i$ and the rectangular matrices $B_{ik}$ are determined by the dimensions of the diagonal blocks in the Jordan matrix $P_{-1}$, i.e., by the degrees of the elementary divisors of $P_{-1}$.

60 Here the splitting into blocks corresponds to that of $P_{-1}$ and $P^*(z)$.


Hence it follows that the solution (85) can be given in the form

$$Y = z^M z^U,$$

and the solution of (81) can be represented as follows:

$$X = A(z)\,z^M z^U. \tag{110}$$

Here $A(z)$ is the matrix series (84), $M$ is a constant diagonal matrix whose elements are integers, and $U$ is a constant triangular matrix. The matrices $M$ and $U$ are defined by (108), (108'), and (109).⁶¹
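The identity $P^*(z) = D_z(z^M z^U)$ used above can be checked in a few lines. The following verification is a sketch that assumes $D_z$ denotes the multiplicative derivative $D_z X = \frac{dX}{dz}X^{-1}$, and uses only $\frac{d}{dz}z^M = \frac{M}{z}z^M$ and $\frac{d}{dz}z^U = \frac{U}{z}z^U$:

$$D_z\left(z^M z^U\right) = \left[\frac{M}{z}\,z^M z^U + z^M\,\frac{U}{z}\,z^U\right] z^{-U} z^{-M} = \frac{M}{z} + z^M\,\frac{U}{z}\,z^{-M}.$$

Since $M$ is block-diagonal with blocks $m_i E_i$, the block $(i, k)$ of $z^M \frac{U}{z} z^{-M}$ is $z^{m_i - m_k - 1}$ times the block $(i, k)$ of $U$. For $i = k$ this gives $\frac{\tilde{\lambda}_i E_i + H_i}{z}$, which together with $\frac{m_i E_i}{z}$ yields $\frac{\lambda_i E_i + H_i}{z}$; for $i < k$ it gives $B_{ik}\,z^{m_i - m_k - 1}$, in agreement with (107) once $\lambda_i - \lambda_k$ is replaced by $m_i - m_k$.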


3. We now proceed to prove the convergence of the series

$$A(z) = E + A_1 z + A_2 z^2 + \cdots.$$

We shall use a lemma which is of independent interest.


Lemma: If the series⁶²

$$x = a_0 + a_1 z + a_2 z^2 + \cdots \tag{111}$$

formally satisfies the system

$$\frac{dx}{dz} = P(z)\,x \tag{112}$$

for which $z = 0$ is a regular singularity, then (111) converges in every neighborhood of $z = 0$ in which the expansion of the coefficient matrix $P(z)$ in the series (82) converges.

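Proof. Let us suppose that

$$P(z) = \frac{P_{-1}}{z} + \sum_{q=0}^{\infty} P_q z^q,$$

where the series $\sum_{q=0}^{\infty} P_q z^q$ converges for $|z| < r$. Then there exist positive constants $p_{-1}$ and $p$ such that⁶³

$$\operatorname{mod} P_{-1} \le p_{-1} I, \qquad \operatorname{mod} P_q \le \frac{p}{r^q}\,I, \qquad I = \|1\| \qquad (q = 0, 1, 2, \dots). \tag{113}$$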


Substituting the series (111) for $x$ in (112) and comparing the coefficients of like powers of $z$ on both sides of (112), we obtain an infinite system of (column) vector equations


61 See footnote 53. 


62 Here $x = (x_1, x_2, \dots, x_n)$ is a column of unknown functions; $a_0, a_1, a_2, \dots$ are constant columns; $P(z)$ is a square coefficient matrix.


63 For the definition of the modulus of a matrix, see p. 128. 




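$$\begin{aligned} P_{-1}\,a_0 &= 0, \\ (E - P_{-1})\,a_1 &= P_0 a_0, \\ (2E - P_{-1})\,a_2 &= P_0 a_1 + P_1 a_0, \\ &\;\;\vdots \\ (mE - P_{-1})\,a_m &= P_0 a_{m-1} + P_1 a_{m-2} + \cdots + P_{m-1} a_0, \\ &\;\;\vdots \end{aligned} \tag{114}$$

It is sufficient to prove that every remainder of the series (111)

$$x^{(k)} = a_k z^k + a_{k+1} z^{k+1} + \cdots \tag{115}$$

converges in a neighborhood of $z = 0$. The number $k$ is subject to the inequality

$$k > n p_{-1}.$$

Then $k$ exceeds the moduli of all the characteristic values of $P_{-1}$,⁶⁴ so that for $m \ge k$ we have $|mE - P_{-1}| \neq 0$ and

$$(mE - P_{-1})^{-1} = \frac{1}{m}\left(E - \frac{P_{-1}}{m}\right)^{-1} = \frac{1}{m}E + \frac{1}{m^2}P_{-1} + \frac{1}{m^3}P_{-1}^2 + \cdots \tag{116}$$
$$(m = k, k+1, \dots).$$

In the last part of this equation there is a convergent matrix series. With the help of this series and by using (114), we can express all the coefficients of (115) in terms of $a_0, a_1, \dots, a_{k-1}$ by means of the recurrence relations

$$a_m = \left(\frac{1}{m}E + \frac{1}{m^2}P_{-1} + \frac{1}{m^3}P_{-1}^2 + \cdots\right)\left(f_{m-1} + P_0 a_{m-1} + \cdots + P_{m-k-1} a_k\right) \tag{117}$$
$$(m = k, k+1, \dots),$$

where

$$f_{m-1} = P_{m-k}\,a_{k-1} + \cdots + P_{m-1}\,a_0 \qquad (m = k, k+1, \dots). \tag{118}$$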


Note that this series (115) formally satisfies the differential equation 


64 If $\lambda_0$ is a characteristic value of $A = \|a_{ik}\|_1^n$, then $|\lambda_0| \le n \max |a_{ik}|$. For let $Ax = \lambda_0 x$, where $x = (x_1, x_2, \dots, x_n) \neq 0$. Then

$$\lambda_0 x_i = \sum_{k=1}^{n} a_{ik} x_k \qquad (i = 1, 2, \dots, n).$$

Let $|x_j| = \max\{|x_1|, |x_2|, \dots, |x_n|\}$. Then

$$|\lambda_0|\,|x_j| \le \sum_{k=1}^{n} |a_{jk}|\,|x_k| \le |x_j|\; n \max_{1 \le i,k \le n} |a_{ik}|.$$

Dividing through by $|x_j|$, we obtain the required inequality.




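$$\frac{dx^{(k)}}{dz} = P(z)\,x^{(k)} + f(z), \tag{119}$$

where

$$f(z) = \sum_{m=k-1}^{\infty} f_m z^m = P(z)\left(a_0 + a_1 z + \cdots + a_{k-1} z^{k-1}\right) - a_1 - 2a_2 z - \cdots - (k-1)\,a_{k-1} z^{k-2}. \tag{120}$$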


From (120) it follows that the series

$$\sum_{m=k-1}^{\infty} f_m z^m$$

converges for $|z| < r$; hence there exists an integer $N > 0$ such that⁶⁵

$$\operatorname{mod} f_m \le \left\|\frac{N}{r^m}\right\| \qquad (m = k-1, k, \dots). \tag{121}$$


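From the form of the recurrence relations (117) it follows that when the matrices $P_{-1}$, $P_q$, $f_{m-1}$ in these relations are replaced by the majorant matrices $p_{-1}I$, $\frac{p}{r^q}I$, $\left\|\frac{N}{r^{m-1}}\right\|$ and the column $a_m$ by $\|\alpha_m\|$ ($m = k, k+1, \dots$; $q = 0, 1, 2, \dots$),⁶⁶ then we obtain relations that determine upper bounds $\|\alpha_m\|$ for $\operatorname{mod} a_m$:

$$\operatorname{mod} a_m \le \|\alpha_m\|. \tag{122}$$

Therefore the series

$$\xi^{(k)} = \alpha_k z^k + \alpha_{k+1} z^{k+1} + \cdots \tag{123}$$

after term-by-term multiplication with the column $\|1\|$ becomes a majorant series for (115).

By replacing in (119) the matrix coefficients $P_{-1}$, $P_q$, $f_m$ of the series

$$P(z) = \frac{P_{-1}}{z} + \sum_{q=0}^{\infty} P_q z^q, \qquad f(z) = \sum_{m=k-1}^{\infty} f_m z^m$$

by the corresponding majorant matrices $p_{-1}I$, $\frac{p}{r^q}I$, $\left\|\frac{N}{r^m}\right\|$, and $x^{(k)}$ by $\|\xi^{(k)}\|$, we obtain a differential equation for $\xi^{(k)}$:

$$\frac{d\xi^{(k)}}{dz} = n\left(\frac{p_{-1}}{z} + \frac{p}{1 - z r^{-1}}\right)\xi^{(k)} + \frac{N}{r^{k-1}}\,\frac{z^{k-1}}{1 - z r^{-1}}. \tag{124}$$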


65 Here $\|N/r^m\|$ denotes the column in which all the elements are equal to one and the same number, $N/r^m$.

66 Here $\|\alpha_m\|$ denotes the column $(\alpha_m, \alpha_m, \dots, \alpha_m)$ ($\alpha_m$ is a constant, $m = k, k+1, \dots$).




This linear differential equation has the particular solution

$$\xi^{(k)} = \frac{N}{r^{k-1}}\; z^{n p_{-1}} \left(1 - z r^{-1}\right)^{-npr} \int_0^z z^{k - n p_{-1} - 1} \left(1 - z r^{-1}\right)^{npr - 1} dz, \tag{125}$$

which is regular for $z = 0$ and can be expanded in a neighborhood of this point in the power series (123), which is convergent for $|z| < r$.

From the convergence of the majorant series (123) it follows that the series (115) is convergent for $|z| < r$, and the lemma is proved.


Note 1. This proof enables us to determine all the solutions of the differential system (112) that are regular at the singular point, provided such solutions exist.

For the existence of regular solutions (not identically zero) it is necessary and sufficient that the residue matrix $P_{-1}$ have a non-negative integral characteristic value. If $s$ is the greatest integral characteristic value, then columns $a_0, a_1, \dots, a_s$ that do not all vanish can be determined from the first $s + 1$ of the equations (114); for the determinant of the corresponding linear homogeneous equation is zero:

$$\Delta = |P_{-1}|\,|E - P_{-1}| \cdots |sE - P_{-1}| = 0.$$

From the remaining equations of (114) the columns $a_{s+1}, a_{s+2}, \dots$ can be expressed uniquely in terms of $a_0, a_1, \dots, a_s$. The series (111) so obtained converges, by the lemma. Thus, the linearly independent solutions of the first $s + 1$ equations (114) determine all the linearly independent solutions of the system (112) that are regular at the singular point $z = 0$.

If $z = 0$ is a singular point, then a regular solution (111) at that point (if such a solution exists) is not uniquely determined when the initial value $a_0$ is given. However, a solution that is regular at a regular singularity is uniquely determined when $a_0, a_1, \dots, a_s$ are given, i.e., when the initial values at $z = 0$ of this solution and the initial values of its first $s$ derivatives are given ($s$ is the largest non-negative integral characteristic value of the residue matrix $P_{-1}$).
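The determination of the coefficients from (114) can be sketched in code. This is a minimal illustration, not the author's procedure verbatim: it assumes the caller supplies a column $a_0$ with $P_{-1}a_0 = 0$ and that $mE - P_{-1}$ is nonsingular for every $m \ge 1$ that is computed (the case $s = 0$ of the note). The names `formal_series`, `solve`, and `matvec` are hypothetical helpers, and exact rational arithmetic is used so that the coefficients can be compared with known expansions.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A square, nonsingular)."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(b[i])] for i, row in enumerate(A)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)  # fails if singular
        M[c], M[piv] = M[piv], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [M[i][n] for i in range(n)]

def matvec(P, a):
    """Product of a square matrix P with a column a."""
    return [sum(Fraction(P[i][j]) * a[j] for j in range(len(a)))
            for i in range(len(P))]

def formal_series(P_res, P, a0, count):
    """Columns a_0 .. a_{count-1} of the formal solution of dx/dz = P(z) x.

    P_res is the residue matrix P_{-1}; P is the list [P_0, P_1, ...]
    (coefficients beyond the end of the list are treated as zero).
    Uses (m E - P_{-1}) a_m = P_0 a_{m-1} + ... + P_{m-1} a_0 from (114).
    """
    n = len(a0)
    a = [[Fraction(v) for v in a0]]
    for m in range(1, count):
        rhs = [Fraction(0)] * n
        for q in range(min(m, len(P))):
            term = matvec(P[q], a[m - 1 - q])
            rhs = [x + y for x, y in zip(rhs, term)]
        A = [[(m if i == j else 0) - Fraction(P_res[i][j]) for j in range(n)]
             for i in range(n)]
        a.append(solve(A, rhs))
    return a

# Scalar check: dx/dz = x (P_{-1} = 0, P_0 = 1) has the solution e^z.
print(formal_series([[0]], [[[1]]], [1], 5))
```

In the scalar check the computed coefficients are those of the exponential series, which is the regular solution normalized by $a_0 = 1$.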

Note 2. The proof of the lemma remains valid for $P_{-1} = O$. In this case an arbitrary positive number can be chosen for $p_{-1}$ in the proof of the lemma. For $P_{-1} = O$ the lemma states the well-known proposition on the existence of a regular solution in a neighborhood of a regular point of the system. In this case the solution is uniquely determined when the initial value $a_0$ is given.


4. Suppose given the system

$$\frac{dX}{dz} = P(z)\,X, \tag{126}$$

where

$$P(z) = \frac{P_{-1}}{z} + \sum_{m=0}^{\infty} P_m z^m$$

and the series on the right-hand side converges for $|z| < r$.

Suppose, further, that by setting

$$X = A(z)\,Y \tag{127}$$

and substituting for $A(z)$ the series

$$A(z) = A_0 + A_1 z + A_2 z^2 + \cdots, \tag{128}$$

we obtain after formal transformations:

$$\frac{dY}{dz} = P^*(z)\,Y, \tag{129}$$

where

$$P^*(z) = \frac{P_{-1}^*}{z} + \sum_{m=0}^{\infty} P_m^* z^m,$$

and that here, as in the expression for $P(z)$, the series on the right-hand side converges for $|z| < r$.

We shall show that the series (128) also converges in the neighborhood $|z| < r$ of $z = 0$.

Indeed, it follows from (126), (127), and (129) that the series (128) formally satisfies the following matrix differential equation

$$\frac{dA}{dz} = P(z)\,A - A\,P^*(z). \tag{130}$$

We shall regard $A$ as a vector (column) in the space of all matrices of order $n$, i.e., a space of dimension $n^2$. If in this space a linear operator $\hat{P}(z)$ on $A$, depending analytically on a parameter $z$, is defined by the equation

$$\hat{P}(z)[A] = P(z)\,A - A\,P^*(z), \tag{131}$$

then the differential equation (130) can be written in the form

$$\frac{dA}{dz} = \hat{P}(z)[A]. \tag{132}$$

The right-hand side of this equation can be considered as the product of the matrix $\hat{P}(z)$ of order $n^2$ and the column $A$ of $n^2$ elements. From (131) it is clear that $z = 0$ is a regular singularity of the system (132). The series (128) formally satisfies this system. Therefore, by applying the lemma, we conclude that (128) converges in the neighborhood $|z| < r$ of $z = 0$.


In particular, the series for A(z) in (110) also converges. 
Thus, we have proved the following theorem : 


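THEOREM 2. Every system

$$\frac{dX}{dz} = P(z)\,X, \tag{133}$$

with a regular singularity at $z = 0$,

$$P(z) = \frac{P_{-1}}{z} + \sum_{m=0}^{\infty} P_m z^m,$$

has a solution of the form

$$X = A(z)\,z^M z^U, \tag{134}$$

where $A(z)$ is a matrix function that is regular for $z = 0$ and becomes the unit matrix $E$ at that point, and where $M$ and $U$ are constant matrices, $M$ being of simple structure and having integral characteristic values, whereas the difference between any two distinct characteristic values of $U$ is not an integer.

If the matrix $P_{-1}$ is reduced to the Jordan form by means of a non-singular matrix $T$,

$$P_{-1} = T\,\{\lambda_1 E_1 + H_1,\ \lambda_2 E_2 + H_2,\ \dots,\ \lambda_s E_s + H_s\}\,T^{-1} \tag{135}$$
$$(\operatorname{Re}(\lambda_1) \ge \operatorname{Re}(\lambda_2) \ge \cdots \ge \operatorname{Re}(\lambda_s)),$$

then $M$ and $U$ can be chosen in the form

$$M = T\,\{m_1 E_1,\ m_2 E_2,\ \dots,\ m_s E_s\}\,T^{-1}, \tag{136}$$

$$U = T \begin{Vmatrix} \tilde{\lambda}_1 E_1 + H_1 & B_{12} & \cdots & B_{1s} \\ O & \tilde{\lambda}_2 E_2 + H_2 & \cdots & B_{2s} \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & \tilde{\lambda}_s E_s + H_s \end{Vmatrix} T^{-1}, \tag{137}$$

where

$$m_i = [\operatorname{Re}(\lambda_i)], \qquad \tilde{\lambda}_i = \lambda_i - m_i \qquad (i = 1, 2, \dots, s). \tag{138}$$

The $B_{ik}$ are regular lower triangular matrices ($i, k = 1, 2, \dots, s$) and $B_{ik} = O$ if $\lambda_i - \lambda_k$ is not a positive integer ($i, k = 1, 2, \dots, s$).

In the particular case where none of the differences $\lambda_i - \lambda_k$ ($i, k = 1, 2, 3, \dots, s$) is a positive integer, we can set in (134) $M = O$ and $U = P_{-1}$; i.e., in this case the solution can be represented in the form

$$X = A(z)\,z^{P_{-1}}. \tag{139}$$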


Note 1. We wish to point out that in this section we have developed an algorithm to determine the coefficients of the series $A(z) = \sum_{m=0}^{\infty} A_m z^m$ ($A_0 = E$) in terms of the coefficients $P_m$ of the series for $P(z)$. Moreover, the theorem also determines the integral substitution $V$ by which the solution (134) is multiplied when a circuit is made once in the positive direction around the singular point $z = 0$:

$$V = e^{2\pi i U}.$$
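The formula for $V$ can be illustrated by a small worked example. The values here are chosen for illustration only and are not taken from the text: let $n = 2$, $M = O$, and let $U$ be nilpotent,

$$U = \begin{Vmatrix} 0 & 1 \\ 0 & 0 \end{Vmatrix}, \qquad z^U = e^{U \ln z} = E + U \ln z = \begin{Vmatrix} 1 & \ln z \\ 0 & 1 \end{Vmatrix}.$$

After one positive circuit around $z = 0$ the logarithm $\ln z$ goes over into $\ln z + 2\pi i$, so that

$$z^U \;\longrightarrow\; \begin{Vmatrix} 1 & \ln z + 2\pi i \\ 0 & 1 \end{Vmatrix} = \begin{Vmatrix} 1 & \ln z \\ 0 & 1 \end{Vmatrix} \begin{Vmatrix} 1 & 2\pi i \\ 0 & 1 \end{Vmatrix} = z^U\,e^{2\pi i U},$$

since $e^{2\pi i U} = E + 2\pi i\,U$ when $U^2 = O$. The factor $z^M$ with integral $M$ and the regular factor $A(z)$ return to their initial values, so the whole solution is multiplied on the right by $e^{2\pi i U}$.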
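Note 2. From the enunciation of the theorem it follows that

$$B_{ik} = O \quad \text{for } \tilde{\lambda}_i \neq \tilde{\lambda}_k \qquad (i, k = 1, 2, \dots, s).$$

Therefore the matrices

$$\tilde{\Lambda} = T\,\{\tilde{\lambda}_1 E_1,\ \tilde{\lambda}_2 E_2,\ \dots,\ \tilde{\lambda}_s E_s\}\,T^{-1} \quad \text{and} \quad \tilde{U} = T \begin{Vmatrix} H_1 & B_{12} & \cdots & B_{1s} \\ O & H_2 & \cdots & B_{2s} \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & H_s \end{Vmatrix} T^{-1} \tag{140}$$

are permutable:

$$\tilde{\Lambda}\tilde{U} = \tilde{U}\tilde{\Lambda}.$$

Hence

$$z^M z^U = z^M z^{\tilde{\Lambda} + \tilde{U}} = z^M z^{\tilde{\Lambda}} z^{\tilde{U}} = z^{\Lambda} z^{\tilde{U}}, \tag{141}$$

where

$$\Lambda = M + \tilde{\Lambda} = T\,\{\lambda_1, \lambda_2, \dots, \lambda_n\}\,T^{-1} \tag{142}$$

and where $\lambda_1, \lambda_2, \dots, \lambda_n$ are all the characteristic values of $P_{-1}$ arranged in the order $\operatorname{Re} \lambda_1 \ge \operatorname{Re} \lambda_2 \ge \cdots \ge \operatorname{Re} \lambda_n$.

On the other hand,

$$z^{\tilde{U}} = h(\tilde{U}),$$

where $h(\lambda)$ is the Lagrange-Sylvester interpolation polynomial for $f(\lambda) = z^{\lambda}$.

Since all the characteristic values of $\tilde{U}$ are zero, $h(\lambda)$ depends linearly on $f(0), f'(0), \dots, f^{(g-1)}(0)$, i.e., on $1, \ln z, \dots, (\ln z)^{g-1}$ ($g$ is the least exponent for which $\tilde{U}^g = O$). Therefore

$$h(\lambda) = \sum_{j=0}^{g-1} h_j(\lambda)\,(\ln z)^j$$

and

$$z^{\tilde{U}} = h(\tilde{U}) = \sum_{j=0}^{g-1} h_j(\tilde{U})\,(\ln z)^j = T \begin{Vmatrix} 1 & q_{12} & \cdots & q_{1n} \\ 0 & 1 & \cdots & q_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{Vmatrix} T^{-1}, \tag{143}$$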
j=mO 




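where $q_{ij}$ ($i, j = 1, 2, \dots, n$; $i < j$) are polynomials in $\ln z$ of degree less than $g$.

By (134), (141), (142), and (143) a particular solution of (126) can be chosen in the form

$$X = A(z) \begin{Vmatrix} z^{\lambda_1} & 0 & \cdots & 0 \\ 0 & z^{\lambda_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & z^{\lambda_n} \end{Vmatrix} \begin{Vmatrix} 1 & q_{12} & \cdots & q_{1n} \\ 0 & 1 & \cdots & q_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{Vmatrix}. \tag{144}$$

Here $\lambda_1, \lambda_2, \dots, \lambda_n$ are the characteristic values of $P_{-1}$ arranged in the order $\operatorname{Re} \lambda_1 \ge \operatorname{Re} \lambda_2 \ge \cdots \ge \operatorname{Re} \lambda_n$, and $q_{ij}$ ($i, j = 1, 2, \dots, n$; $i < j$) are polynomials in $\ln z$ of degree not higher than $g - 1$, where $g$ is the maximal number of characteristic values $\lambda_i$ that differ from each other by an integer; $A(z)$ is a matrix function, regular at $z = 0$, and $A(0) = T$ ($|T| \neq 0$). If $P_{-1}$ has the Jordan form, then $T = E$.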


§ 11. Reducible Analytic Systems 


1. As an application of the theorem of the preceding section we shall investigate in what cases the system

$$\frac{dX}{dt} = Q(t)\,X, \tag{145}$$

where

$$Q(t) = \sum_{m=1}^{\infty} \frac{Q_m}{t^m} \tag{146}$$

is a convergent series for $t > t_0$, is reducible (in the sense of Lyapunov), i.e., in what cases the system has a solution of the form

$$X = L(t)\,e^{Bt}, \tag{147}$$

where $L(t)$ is a Lyapunov matrix (i.e., $L(t)$ satisfies the conditions 1.-3. on p. 117) and $B$ is a constant matrix.⁶⁷ Here $X$ and $Q$ are matrices with complex elements and $t$ is a real variable.

We make the transformation

$$z = \frac{1}{t}.$$


67 If the equation (147) holds, then the Lyapunov transformation $X = L(t)\,Y$ carries the system (145) into the system $\frac{dY}{dt} = BY$.




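Then the system (145) assumes the form

$$\frac{dX}{dz} = P(z)\,X, \tag{148}$$

where

$$P(z) = -z^{-2}\,Q\!\left(\frac{1}{z}\right) = -\frac{Q_1}{z} - \sum_{m=0}^{\infty} Q_{m+2}\,z^m. \tag{149}$$

The series on the right-hand side of the expression for $P(z)$ converges for $|z| < 1/t_0$. Two cases can arise:

1) $Q_1 = O$. In that case $z = 0$ is not a singular point of the system (148). The system has a solution that is regular and normalized at $z = 0$. This solution is given by a convergent power series

$$X(z) = E + X_1 z + X_2 z^2 + \cdots \qquad \left(|z| < \frac{1}{t_0}\right).$$

Setting

$$L(t) = X\!\left(\frac{1}{t}\right), \qquad B = O,$$

we obtain the required representation (147). The system is reducible.

2) $Q_1 \neq O$. In that case the system (148) has a regular singularity at $z = 0$.

Without loss of generality we may assume that the residue matrix $P_{-1} = -Q_1$ is reduced to the Jordan form in which the diagonal elements $\lambda_1, \lambda_2, \dots, \lambda_n$ are arranged in the order $\operatorname{Re} \lambda_1 \ge \operatorname{Re} \lambda_2 \ge \cdots \ge \operatorname{Re} \lambda_n$.

Then in (144) $T = E$, and therefore the system (148) has the solution

$$X = A(z) \begin{Vmatrix} z^{\lambda_1} & 0 & \cdots & 0 \\ 0 & z^{\lambda_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & z^{\lambda_n} \end{Vmatrix} \begin{Vmatrix} 1 & q_{12} & \cdots & q_{1n} \\ 0 & 1 & \cdots & q_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{Vmatrix},$$

where the function $A(z)$ is regular for $z = 0$ and assumes at this point the value $E$, and where $q_{ik}$ ($i, k = 1, 2, \dots, n$; $i < k$) are polynomials in $\ln z$. When we replace $z$ by $1/t$, we have:

$$X = A\!\left(\frac{1}{t}\right) \begin{Vmatrix} \left(\frac{1}{t}\right)^{\lambda_1} & 0 & \cdots & 0 \\ 0 & \left(\frac{1}{t}\right)^{\lambda_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \left(\frac{1}{t}\right)^{\lambda_n} \end{Vmatrix} \begin{Vmatrix} 1 & q_{12}\!\left(\ln \frac{1}{t}\right) & \cdots & q_{1n}\!\left(\ln \frac{1}{t}\right) \\ 0 & 1 & \cdots & q_{2n}\!\left(\ln \frac{1}{t}\right) \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{Vmatrix}. \tag{150}$$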



Since $X = A(1/t)\,Y$ is a Lyapunov transformation, the system (145) is reducible to a system with constant coefficients if and only if the product

$$L_1(t) = \begin{Vmatrix} \left(\frac{1}{t}\right)^{\lambda_1} & 0 & \cdots & 0 \\ 0 & \left(\frac{1}{t}\right)^{\lambda_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \left(\frac{1}{t}\right)^{\lambda_n} \end{Vmatrix} \begin{Vmatrix} 1 & q_{12}\!\left(\ln \frac{1}{t}\right) & \cdots & q_{1n}\!\left(\ln \frac{1}{t}\right) \\ 0 & 1 & \cdots & q_{2n}\!\left(\ln \frac{1}{t}\right) \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{Vmatrix} e^{-Bt}, \tag{151}$$

where $B$ is a constant matrix, is a Lyapunov matrix, i.e., when the matrices $L_1(t)$, $\frac{dL_1}{dt}$, and $L_1^{-1}(t)$ are bounded. It follows from the theorem of Erugin (§ 4) that the matrix $B$ can be assumed here to have real characteristic values.

Since $L_1(t)$ and $L_1^{-1}(t)$ are bounded for $t > t_0$, all the characteristic values of $B$ must be zero. This follows from the expression for $e^{Bt}$ and $e^{-Bt}$ obtained from (151). Moreover, all the numbers $\lambda_1, \lambda_2, \dots, \lambda_n$ must be pure imaginary, because by (151) the fact that the elements of the last row of $L_1(t)$ and of the first column of $L_1^{-1}(t)$ are bounded implies that $\operatorname{Re} \lambda_n \ge 0$ and $\operatorname{Re} \lambda_1 \le 0$.

But if all the characteristic values of $P_{-1}$ are pure imaginary, then the difference between any two distinct characteristic values of $P_{-1}$ cannot be an integer. Therefore the formula (139) holds:

$$X = A(z)\,z^{P_{-1}} = A\!\left(\frac{1}{t}\right) t^{Q_1},$$

and for the reducibility of the system it is necessary and sufficient that the matrix

$$L_2(t) = t^{Q_1} e^{-Bt} \tag{152}$$

together with its inverse be bounded for $t > t_0$.


Since all the characteristic values of $B$ must be zero, the minimal polynomial of $B$ is of the form $\lambda^d$. We denote by

$$\psi(\lambda) = (\lambda - \mu_1)^{c_1} (\lambda - \mu_2)^{c_2} \cdots (\lambda - \mu_u)^{c_u} \qquad (\mu_i \neq \mu_k \text{ for } i \neq k)$$

the minimal polynomial of $Q_1$. As $Q_1 = -P_{-1}$, the numbers $\mu_1, \mu_2, \dots, \mu_u$ differ only in sign from the corresponding numbers $\lambda_i$ and are therefore all pure imaginary. Then (see the formulas (12), (13) on p. 116)




$$t^{Q_1} = \sum_{k=1}^{u} \left[U_{k0} + U_{k1} \ln t + \cdots + U_{k,c_k-1} (\ln t)^{c_k - 1}\right] t^{\mu_k}, \tag{153}$$

$$e^{Bt} = V_0 + V_1 t + \cdots + V_{d-1} t^{d-1}. \tag{154}$$

Substituting these expressions in the equation

$$L_2(t)\,e^{Bt} = t^{Q_1},$$

we obtain

$$\left[L_2(t)\,V_{d-1} + (*)\right] t^{d-1} = Z_0(t)\,(\ln t)^{c-1}, \tag{155}$$

where $c$ is the greatest of the numbers $c_1, c_2, \dots, c_u$, $(*)$ denotes a matrix that tends to zero for $t \to \infty$, and $Z_0(t)$ is a bounded matrix for $t > t_0$. Since the matrices on both sides of (155) must be of equal order of magnitude for $t \to \infty$, we have

$$d = c = 1,$$

i.e.,

$$B = O,$$

and the matrix $Q_1$ has simple elementary divisors.

Conversely, if $Q_1$ has simple elementary divisors and pure imaginary characteristic values $\mu_1, \mu_2, \dots, \mu_n$, then

$$X = A(z)\,z^{-Q_1} = A(z)\,\|z^{-\mu_i}\delta_{ik}\|_1^n$$

is a solution of (149). Setting $z = 1/t$, we find:

$$X = A\!\left(\frac{1}{t}\right) \|t^{\mu_i}\delta_{ik}\|_1^n.$$

The function $X(t)$ as well as $\frac{dX}{dt}$ and the inverse matrix $X^{-1}(t)$ are bounded for $t > t_0$. Therefore the system is reducible ($B = O$). Thus we have proved the following theorem:⁶⁸


THEOREM 3: The system

$$\frac{dX}{dt} = Q(t)\,X,$$

where the matrix $Q(t)$ can be represented in a series convergent for $t > t_0$,

$$Q(t) = \frac{Q_1}{t} + \frac{Q_2}{t^2} + \cdots,$$

is reducible if and only if all the elementary divisors of the residue matrix $Q_1$ are simple and all its characteristic values pure imaginary.
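Two small examples may help to see the criterion at work. Both are illustrations chosen here, not taken from the text, and both take $Q(t) = Q_1/t$, so that $X = t^{Q_1} = e^{Q_1 \ln t}$ is a solution.

$$Q_1 = \begin{Vmatrix} 0 & 1 \\ -1 & 0 \end{Vmatrix}: \qquad t^{Q_1} = \begin{Vmatrix} \cos \ln t & \sin \ln t \\ -\sin \ln t & \cos \ln t \end{Vmatrix}.$$

Here the characteristic values $\pm i$ are pure imaginary and the elementary divisors are simple; the solution, its inverse, and its derivative $\frac{Q_1}{t}\,t^{Q_1}$ are all bounded for $t > t_0 > 0$, so the system is reducible with $B = O$.

$$Q_1 = \begin{Vmatrix} 0 & 1 \\ 0 & 0 \end{Vmatrix}: \qquad t^{Q_1} = E + Q_1 \ln t = \begin{Vmatrix} 1 & \ln t \\ 0 & 1 \end{Vmatrix}.$$

Here the only characteristic value is $0$, but the elementary divisor $\lambda^2$ is not simple, so by the theorem the system is not reducible; the logarithmic growth of the solution is the obstruction that appears in (155).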


68 See Erugin [13]. The theorem is proved for the case where $Q_1$ does not have distinct characteristic values that differ from each other by an integer.




§ 12. Analytic Functions of Several Matrices and their 
Application to the Investigation of Differential Systems. 
The Papers of Lappo-Danilevskii 


1. An analytic function of $m$ matrices $X_1, X_2, \dots, X_m$ of order $n$ can be given by a series

$$F(X_1, X_2, \dots, X_m) = \alpha_0 + \sum_{\nu=1}^{\infty} \sum_{j_1, j_2, \dots, j_\nu}^{(1, \dots, m)} \alpha_{j_1 j_2 \dots j_\nu}\, X_{j_1} X_{j_2} \cdots X_{j_\nu} \tag{156}$$

convergent for all matrices $X_j$ of order $n$ that satisfy the inequality

$$\operatorname{mod} X_j < R_j \qquad (j = 1, 2, \dots, m). \tag{157}$$

Here the coefficients

$$\alpha_0,\ \alpha_{j_1 j_2 \dots j_\nu} \qquad (j_1, j_2, \dots, j_\nu = 1, 2, \dots, m;\ \nu = 1, 2, 3, \dots)$$

are complex numbers, $R_j$ ($j = 1, 2, \dots, m$) are constant matrices of order $n$ with positive elements, and $X_j$ ($j = 1, 2, \dots, m$) are permutable matrices of the same order with complex elements.

The theory of analytic functions of several matrices was developed by 
I. A. Lappo-Danilevskii. He used this theory as a basis for fundamental 
investigations on systems of linear differential equations with rational 
coefficients. 

A system with rational coefficients can always be reduced to the form

$$\frac{dX}{dz} = \sum_{j=1}^{m} \left\{ \frac{U_{j0}}{(z - a_j)^{s_j}} + \frac{U_{j1}}{(z - a_j)^{s_j - 1}} + \cdots + \frac{U_{j,s_j-1}}{z - a_j} \right\} X \tag{158}$$

after a suitable transformation of the independent variable, where $U_{jk}$ are constant matrices of order $n$, $a_j$ are complex numbers, and $s_j$ are positive integers ($k = 0, 1, \dots, s_j - 1$; $j = 1, 2, \dots, m$).⁶⁹

We shall illustrate some of Lappo-Danilevskii's results in the special case of the so-called regular systems. The latter are characterized by the condition $s_1 = s_2 = \cdots = s_m = 1$ and can be written in the form

$$\frac{dX}{dz} = \sum_{j=1}^{m} \frac{U_j}{z - a_j}\,X. \tag{159}$$


69 In the system (158) all the coefficients are regular rational fractions in $z$. Arbitrary rational coefficients can be reduced to this form by carrying a finite point $z = c$ that is regular (for all coefficients) by means of a fractional linear transformation on $z$ into $z = \infty$.




Following Lappo-Danilevskii, we introduce special analytic functions, namely hyperlogarithms, which are defined by the following recurrence relations:

$$l_b(z; a_{j_1}) = \int_b^z \frac{dz}{z - a_{j_1}},$$

$$l_b(z; a_{j_1}, a_{j_2}, \dots, a_{j_\nu}) = \int_b^z \frac{l_b(z; a_{j_2}, \dots, a_{j_\nu})}{z - a_{j_1}}\,dz.$$

Regarding $a_1, a_2, \dots, a_m, \infty$ as branch points of logarithmic type, we construct the corresponding Riemann surface $S(a_1, a_2, \dots, a_m; \infty)$. Every hyperlogarithm is a single-valued function on this surface. On the other hand, the matricant $\Omega_b^z$ of the system (159) (i.e., the solution normalized at $z = b$) after analytic continuation can also be regarded as a single-valued function on $S(a_1, a_2, \dots, a_m; \infty)$; here $b$ can be chosen as an arbitrary finite point on $S$ other than $a_1, a_2, \dots, a_m$.

For the normalized solution $\Omega_b^z$ Lappo-Danilevskii gives an explicit expression in terms of the defining matrices $U_1, U_2, \dots, U_m$ of (159) in the form of a series

$$\Omega_b^z = E + \sum_{\nu=1}^{\infty} \sum_{j_1, \dots, j_\nu}^{(1, \dots, m)} l_b(z; a_{j_1}, a_{j_2}, \dots, a_{j_\nu})\, U_{j_1} U_{j_2} \cdots U_{j_\nu}. \tag{160}$$

This expansion converges uniformly in $z$ for arbitrary $U_1, U_2, \dots, U_m$ and represents $\Omega_b^z$ in any finite domain on $S(a_1, a_2, \dots, a_m; \infty)$ provided only that the domain does not contain $a_1, a_2, \dots, a_m$ in the interior or on the boundary.

If the series (156) converges for arbitrary matrices $X_1, X_2, \dots, X_m$, then the corresponding function $F(X_1, X_2, \dots, X_m)$ is called entire. $\Omega_b^z$ is an entire function of the matrices $U_1, U_2, \dots, U_m$.

If in (160) we let the argument $z$ go around the point $a_j$ once in the positive direction along a contour that does not enclose other points $a_i$ (for $i \neq j$), then we obtain the expression for the integral substitution $V_j$ corresponding to the point $z = a_j$:

$$V_j = E + \sum_{\nu=1}^{\infty} \sum_{j_1, \dots, j_\nu}^{(1, \dots, m)} p_j(b; a_{j_1}, a_{j_2}, \dots, a_{j_\nu})\, U_{j_1} U_{j_2} \cdots U_{j_\nu} \tag{161}$$
$$(j = 1, 2, \dots, m),$$

where in a readily understandable notation




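$$p_j(b; a_{j_1}) = \int_{(a_j)} \frac{dz}{z - a_{j_1}},$$

$$p_j(b; a_{j_1}, a_{j_2}, \dots, a_{j_\nu}) = \int_{(a_j)} \frac{l_b(z; a_{j_2}, \dots, a_{j_\nu})}{z - a_{j_1}}\,dz$$
$$(j_1, j_2, \dots, j_\nu, j = 1, 2, \dots, m;\ \nu = 1, 2, 3, \dots).$$

The series (161), like (160), is an entire function of $U_1, U_2, \dots, U_m$.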


2. Generalizing the theory of analytic functions to the case⁷⁰ of a countably infinite set of matrix arguments $X_1, X_2, X_3, \dots$, Lappo-Danilevskii has used it to study the behavior of a solution of a system in a neighborhood of an irregular singularity.⁷¹ We quote the basic result.


The normalized solution $\Omega_b^z$ of the system

$$\frac{dX}{dz} = \sum_{j=-q}^{\infty} P_j z^j\,X,$$

where the power series on the right-hand side converges for $|z| < r$ ($r > 1$),⁷² can be represented by a series


ai=B+ SS Pye Py 
veel Tis Tas sons ym —Y (162). 


y : R—ps 3 
foprticss tht ee ht: thytu # a () 2 Ine 
pm = x= . 


Here ar S 1.4, and a” ;, are scalar coefficients that are defined by 
7 1’ Ce ry 78 es 
special formulas. The series (162) converges for arbitrary matrices P,, Po, 


... in an annulus 


$$\varrho < |z| < r$$

($\varrho$ is any positive number less than $r$). The point $b$ must also lie in this annulus ($\varrho < |b| < r$).


70 See [29], Vol. I, Memoir 1. 
71 See [29], Vol. I, Memoir 3. See also [252], [253], [254], [146], and [147]. 


72 The restriction 7 > 1 is not essential, since this condition can always be obtained by 
replacing 2 by az, where a is a suitably chosen positive number. 




Since in this book we cannot possibly describe the contents of the papers of Lappo-Danilevskii in sufficient detail, we have had to restrict ourselves to giving above statements of a few basic results and we must refer the reader to the appropriate literature.

All the papers of Lappo-Danilevskii that deal with differential equations have been published posthumously in three volumes ([29]: Mémoires sur la théorie des systèmes des équations différentielles linéaires (1934-36)). Moreover, his fundamental results are expounded in the papers [252], [253], [254] and the small book [28]. A concise exposition of some of the results can also be found in the book by V. I. Smirnov [56], Vol. III.


CHAPTER XV 


THE PROBLEM OF ROUTH-HURWITZ AND RELATED QUESTIONS 


§ 1. Introduction 


In Chapter XIV, § 3 we explained that according to Lyapunov's theorem the zero solution of the system of differential equations

$$\frac{dx_i}{dt} = \sum_{k=1}^{n} a_{ik} x_k + (**) \qquad (i = 1, 2, \dots, n) \tag{1}$$

($a_{ik}$ ($i, k = 1, 2, \dots, n$) are constant coefficients) with arbitrary terms $(**)$ of the second and higher orders in $x_1, x_2, \dots, x_n$ is stable if all the characteristic values of the matrix $A = \|a_{ik}\|_1^n$, i.e., all the roots of the secular equation $\Delta(\lambda) \equiv |\lambda E - A| = 0$, have negative real parts.

Therefore the task of establishing necessary and sufficient conditions 
under which all the roots of a given algebraic equation lie in the left half- 
plane is of great significance in a number of applied fields in which the 
stability of mechanical and electrical systems is investigated. 

The importance of this algebraic task was clear to the founders of the 
theory of governors, the British physicist J. C. Maxwell and the Russian 
scientific research engineer I. A. Vyshnegradskii who, in their papers on 
governors,¹ established and extensively applied the above-mentioned algebraic conditions for equations of a degree not exceeding three.

In 1868 Maxwell proposed the mathematical problem of discovering cor- 
responding conditions for algebraic equations of arbitrary degree. Actually 
this problem had already been solved in essence by the French mathematician 
Hermite in a paper [187] published in 1856. In this paper he had estab- 
lished a close connection between the number of roots of a complex poly- 
nomial f(z) in an arbitrary half-plane (and even inside an arbitrary 
triangle) and the signature of a certain quadratic form. But Hermite’s 


1J. C. Maxwell, ‘On governors’ Proc. Roy. Soc. London, vol. 10 (1868); I. A. Vyshne- 
gradskii, ‘On governors with direct action’ (1876). These papers were reprinted in the 
survey ‘Theory of automatic governors’ (Izd. Akad. Nauk SSSR, 1949). See also the 
paper by A. A. Andronov and I. N. Voznesenskii, ‘On the work of J. C. Maxwell, I. A. 
Vyshnegradskii, and A. Stodol in the theory of governors of machines.’ 






results had not been carried to a stage at which they could be used by spe- 
cialists working in applied fields and therefore his paper did not receive 
due recognition. 

In 1875 the British applied mathematician Routh [47], [48], using Sturm's theorem and the theory of Cauchy indices, set up an algorithm to determine the number $k$ of roots of a real polynomial in the right half-plane ($\operatorname{Re} z > 0$). In the particular case $k = 0$ this algorithm then gives a criterion for stability.

At the end of the 19th century, the Austrian research engineer A. Stodol, the founder of the theory of steam and gas turbines, unaware of Routh's paper, again proposed the problem of finding conditions under which all the roots of an algebraic equation have negative real parts, and in 1895 A. Hurwitz [204] on the basis of Hermite's paper gave another solution (independent of Routh's). The determinantal inequalities obtained by Hurwitz are known nowadays as the inequalities of Routh-Hurwitz.

However, even before Hurwitz' paper appeared, the founder of the modern theory of stability, A. M. Lyapunov, had proved in his celebrated dissertation ('The general problem of stability of motion,' Kharkov, 1892)² a theorem which yields necessary and sufficient conditions for all the roots of the characteristic equation of a real matrix $A = \|a_{ik}\|_1^n$ to have negative real parts. These conditions are made use of in a number of papers on the theory of governors.³

A new criterion of stability was set up in 1914 by the French mathematicians Liénard and Chipart [259]. Using special quadratic forms, these authors obtained a criterion of stability which has a definite advantage over the Routh-Hurwitz criterion (the number of determinantal inequalities in the Liénard-Chipart criterion is roughly half of that in the Routh-Hurwitz criterion).

The famous Russian mathematicians P. L. Chebyshev and A. A. Markov 
have proved two remarkable theorems on continued-fraction expansions of a 
special type. These theorems, as will be shown in § 16, have an immediate 
bearing on the Routh-Hurwitz problem. 

The reader will see that in the sphere of problems we have outlined, the 
theory of quadratic forms (Vol. I, Chapter X) and, in particular, the theory 
of Hankel forms (Vol. I, Chapter X, § 10) forms an essential tool. 


§ 2. Cauchy Indices 


1. We begin with a discussion of the so-called Cauchy indices.⁴


2 See [32], § 20. 
3 See, for example, [102]. 




DEFINITION 1: The Cauchy index of a real rational function $R(x)$ between the limits $a$ and $b$ (notation: $I_a^b R(x)$; $a$ and $b$ are real numbers or $\pm\infty$) is the difference between the number of jumps of $R(x)$ from $-\infty$ to $+\infty$ and that of jumps from $+\infty$ to $-\infty$ as the argument changes from $a$ to $b$.⁵

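According to this definition, if

$$R(x) = \sum_{i=1}^{p} \frac{A_i}{x - \alpha_i} + R_1(x),$$

where $A_i$, $\alpha_i$ ($i = 1, 2, \dots, p$) are real numbers and $R_1(x)$ is a rational function⁶ without real poles, then⁷

$$I_{-\infty}^{+\infty} R(x) = \sum_{i=1}^{p} \operatorname{sign} A_i \tag{2}$$

and, in general,

$$I_a^b R(x) = \sum_{a < \alpha_i < b} \operatorname{sign} A_i \qquad (a < b). \tag{2'}$$

In particular, if $f(x) = a_0 (x - \alpha_1)^{n_1} \cdots (x - \alpha_m)^{n_m}$ is a real polynomial ($\alpha_i \neq \alpha_k$ for $i \neq k$; $i, k = 1, 2, \dots, m$) and if among its roots $\alpha_1, \alpha_2, \dots, \alpha_m$ only the first $p$ are real, then

$$\frac{f'(x)}{f(x)} = \sum_{i=1}^{p} \frac{n_i}{x - \alpha_i} + R_1(x), \tag{2''}$$

where $R_1(x)$ is a real rational function without real poles.

Therefore, by (2'): The index

$$I_a^b \frac{f'(x)}{f(x)} \qquad (a < b)$$

is equal to the number of distinct real roots of $f(x)$ in the interval $(a, b)$.

An arbitrary real rational function $R(x)$ can always be represented in the form

$$R(x) = \sum_{i=1}^{p} \left\{ \frac{A_1^{(i)}}{x - \alpha_i} + \cdots + \frac{A_{n_i}^{(i)}}{(x - \alpha_i)^{n_i}} \right\} + R_1(x),$$

where all the $\alpha$ and $A$ are real numbers ($A_{n_i}^{(i)} \neq 0$; $i = 1, 2, \dots, p$) and $R_1(x)$ has no real poles.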
Then 


5 In counting the number of jumps, the extreme values of $x$ (the limits $a$ and $b$) are not included.

6 The poles of a rational function are those values of the argument for which the function becomes infinite.

7 By $\operatorname{sign} a$ ($a$ is a real number) we mean $+1$, $-1$, or $0$ according as $a > 0$, $a < 0$, or $a = 0$.




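$$I_{-\infty}^{+\infty} R(x) = \sum_{(n_i \text{ odd})} \operatorname{sign} A_{n_i}^{(i)} \tag{3}$$

and, in general,⁸

$$I_a^b R(x) = \sum_{\substack{a < \alpha_i < b \\ (n_i \text{ odd})}} \operatorname{sign} A_{n_i}^{(i)} \qquad (a < b). \tag{3'}$$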
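Formulas (3) and (3') translate directly into a few lines of code. The sketch below assumes that the partial-fraction data are already known: each real pole is described by a triple (pole, order, leading coefficient), where the leading coefficient is the number $A_{n_i}^{(i)}$ of the term of highest order at that pole. The function names `sign` and `cauchy_index` are hypothetical.

```python
def sign(a):
    """+1, -1 or 0 according as a > 0, a < 0 or a = 0."""
    return (a > 0) - (a < 0)

def cauchy_index(poles, a=float("-inf"), b=float("inf")):
    """Cauchy index I_a^b R(x) computed from the real poles of R.

    poles: list of (alpha, n, A), where alpha is a real pole, n its order,
    and A the coefficient of 1/(x - alpha)^n. Only poles of odd order
    lying strictly between a and b contribute, as in (3').
    """
    return sum(sign(A) for alpha, n, A in poles
               if n % 2 == 1 and a < alpha < b)

# R(x) = 1/(x - 1) - 2/(x + 1) + 3/(x - 2)^2
R = [(1, 1, 1), (-1, 1, -2), (2, 2, 3)]
print(cauchy_index(R), cauchy_index(R, 0, 3))

# f(x) = (x - 1)^2 (x + 3): f'/f has simple poles with residues 2 and 1.
print(cauchy_index([(1, 1, 2), (-3, 1, 1)]))
```

The pole of even order at $x = 2$ contributes nothing, because the function tends to infinity with the same sign on both sides of it and therefore makes no jump.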


2. One of the methods of computing the index J ’R(xz) is based on the 
classical theorem of Sturm. 
We consider a sequence of real polynomials 


fi(x), fo(z). roy fm(x) (4) 


that has the two following properties with respect to the interval (a, b) :° 


1. For every value z (a< xz <b), if any f,(z) vanishes, the two adja- 
cent functions f,-1(z) and fx4:(2) have values different from zero and of 
opposite signs; i.e., for a << z < b it follows from f;,(72) =0 that 


fu—1(2) fe4i(e) < 0. 


2. The last function f,(z) in (4) does not vanish in the interval (a, b) ; 
ie, fun(z) ~0 fora<zr< os. 


Such a sequence (4) of polynomials is called a Sturm chaan in the inter- 
val (a,b). 

We denote by $V(x)$ the number of variations of sign in (4) for a fixed
value $x$.¹⁰ Then the value of $V(x)$, as $x$ varies from $a$ to $b$, can only change
when one of the functions in (4) passes through zero. But by 1., when the
functions $f_k(x)$ $(k = 2, \ldots, m-1)$ pass through zero, the value of $V(x)$
does not change. When $f_1(x)$ passes through zero, then one variation of
sign in (4) is lost or gained according as the ratio $f_2(x)/f_1(x)$ goes from
$-\infty$ to $+\infty$ or vice versa. Hence we have:


THEOREM 1 (Sturm): If $f_1(x), f_2(x), \ldots, f_m(x)$ is a Sturm chain in
$(a, b)$ and $V(x)$ is the number of variations of sign in the chain, then

$$I_a^b \frac{f_2(x)}{f_1(x)} = V(a) - V(b). \tag{5}$$


8 In (3) the sum is extended over all the values $i$ for which the corresponding $n_i$ is odd.
In (3') the sum is extended over all the $i$ for which $n_i$ is odd and $a < \alpha_i < b$.

9 Here $a$ may be $-\infty$ and $b$ may be $+\infty$.

10 If $a < x < b$ and $f_1(x) \neq 0$, then by 1. in the determination of $V(x)$ a zero value
in (4) may be omitted or an arbitrary sign may be attributed to this value. If $a$ is finite,
then $V(a)$ must be interpreted as $V(a + \varepsilon)$, where $\varepsilon$ is a positive number sufficiently
small that in the half-closed interval $(a, a + \varepsilon]$ none of the functions $f_i(x)$ vanishes.
In exactly the same way, if $b$ is finite, $V(b)$ is to be interpreted as $V(b - \varepsilon)$, where the
number $\varepsilon$ is defined similarly.




Note. Let us multiply all the terms of a Sturm chain by one and the
same arbitrary polynomial $d(x)$. The chain of polynomials so obtained is
called a generalized Sturm chain. Since the multiplication of all the terms
of (4) by one and the same polynomial alters neither the left-hand nor the
right-hand side of (5), Sturm's theorem remains valid for generalized
Sturm chains.

Note that if $f(x)$ and $g(x)$ are any two polynomials (where the degree
of $f(x)$ is not less than that of $g(x)$), then we can always construct a
generalized Sturm chain (4) beginning with $f_1(x) = f(x)$, $f_2(x) = g(x)$ by
means of the Euclidean algorithm.

For if we denote by $-f_3(x)$ the remainder on dividing $f_1(x)$ by $f_2(x)$,
by $-f_4(x)$ the remainder on dividing $f_2(x)$ by $f_3(x)$, etc., then we have
the chain of identities

$$\begin{aligned}
f_1(x) &= q_1(x)\, f_2(x) - f_3(x), \\
&\;\;\vdots \\
f_{k-1}(x) &= q_{k-1}(x)\, f_k(x) - f_{k+1}(x), \\
&\;\;\vdots \\
f_{m-1}(x) &= q_{m-1}(x)\, f_m(x),
\end{aligned} \tag{6}$$

where the last remainder $f_m(x)$ that is not identically zero is the greatest
common divisor of $f(x)$ and $g(x)$ and also of all the functions of the
sequence (4) so constructed. If $f_m(x) \neq 0$ $(a < x < b)$, then this sequence
(4) satisfies the conditions 1., 2. by (6) and is a Sturm chain. If the
polynomial $f_m(x)$ has roots in the interval $(a, b)$, then (4) is a generalized
Sturm chain, because it becomes a Sturm chain when all the terms are
divided by $f_m(x)$.
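
As a small illustration of the identities (6) (an example chosen here, not taken from the text), let $f_1(x) = x^3 - x$ and $f_2(x) = f_1'(x) = 3x^2 - 1$. The Euclidean algorithm with the sign of each remainder reversed gives

$$\begin{aligned}
x^3 - x &= \tfrac{x}{3}\,(3x^2 - 1) - \tfrac{2x}{3}, &\quad f_3(x) &= \tfrac{2x}{3}, \\
3x^2 - 1 &= \tfrac{9x}{2} \cdot \tfrac{2x}{3} - 1, &\quad f_4(x) &= 1, \\
\tfrac{2x}{3} &= \tfrac{2x}{3} \cdot 1 .
\end{aligned}$$

The signs of $f_1, f_2, f_3, f_4$ are $-, +, -, +$ for $x \to -\infty$ and $+, +, +, +$ for $x \to +\infty$, so that $V(-\infty) - V(+\infty) = 3 - 0 = 3$, which agrees with the three distinct real roots $-1, 0, 1$ of $x^3 - x$.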

From what we have shown it follows that the index of every rational
function $R(x)$ can be determined by Sturm's theorem. For this purpose
it is sufficient to represent $R(x)$ in the form $Q(x) + \dfrac{g(x)}{f(x)}$, where $Q(x)$, $f(x)$,
$g(x)$ are polynomials and the degree of $g(x)$ does not exceed that of $f(x)$.
If we then construct the generalized Sturm chain for $f(x)$, $g(x)$, we have

$$I_a^b R(x) = I_a^b \frac{g(x)}{f(x)} = V(a) - V(b).$$

By means of Sturm's theorem we can determine the number of distinct
real roots of a polynomial $f(x)$ in the interval $(a, b)$, since this number, as
we have seen, is $I_a^b \dfrac{f'(x)}{f(x)}$.
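
The construction above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original text: the function names are hypothetical, polynomials are assumed to be given as coefficient lists from the highest degree down, exact rational arithmetic is used, and it is assumed that $g$ is not identically zero and that $f$ does not vanish at the end points $a$, $b$.

```python
from fractions import Fraction

def trim(p):
    """Drop leading zero coefficients (highest degree first)."""
    i = 0
    while i < len(p) - 1 and p[i] == 0:
        i += 1
    return p[i:]

def poly_rem(f, g):
    """Remainder on dividing f by g (g must have a non-zero leading coefficient)."""
    r = [Fraction(c) for c in f]
    while len(r) >= len(g):
        q = r[0] / g[0]
        for i in range(len(g)):
            r[i] -= q * g[i]
        r = r[1:]                      # the leading coefficient is now zero
    return trim(r) if r else [Fraction(0)]

def sturm_chain(f, g):
    """Generalized Sturm chain f1 = f, f2 = g, f_{k+1} = -(remainder of f_{k-1} by f_k)."""
    chain = [trim([Fraction(c) for c in f]), trim([Fraction(c) for c in g])]
    while True:
        r = poly_rem(chain[-2], chain[-1])
        if not any(r):                 # last non-zero remainder reached
            return chain
        chain.append([-c for c in r])

def sign_at(p, x):
    """Sign of p at x; x is a number or one of the strings '+inf', '-inf'."""
    if x in ('+inf', '-inf'):
        s = (p[0] > 0) - (p[0] < 0)
        return s if x == '+inf' or (len(p) - 1) % 2 == 0 else -s
    v = Fraction(0)
    for c in p:                        # Horner evaluation
        v = v * Fraction(x) + c
    return (v > 0) - (v < 0)

def variations(chain, x):
    """Number of sign variations V(x); zero values are omitted (footnote 10)."""
    signs = [s for s in (sign_at(p, x) for p in chain) if s != 0]
    return sum(1 for u, w in zip(signs, signs[1:]) if u != w)

def cauchy_index(g, f, a='-inf', b='+inf'):
    """I_a^b g/f = V(a) - V(b) by Sturm's theorem."""
    chain = sturm_chain(f, g)
    return variations(chain, a) - variations(chain, b)

def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])] or [0]

def count_distinct_real_roots(f, a='-inf', b='+inf'):
    """Number of distinct real roots of f in (a, b), i.e. I_a^b f'/f."""
    return cauchy_index(derivative(f), f, a, b)

print(count_distinct_real_roots([1, 0, -1, 0]))        # x^3 - x
print(count_distinct_real_roots([1, 0, -3, 2], 0, 2))  # (x - 1)^2 (x + 2) on (0, 2)
```

For $x^3 - 3x + 2 = (x-1)^2(x+2)$ the chain is a generalized Sturm chain (its last term is a multiple of $x - 1$), and the count still comes out as the number of distinct roots, as the Note above asserts.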




§ 3. Routh’s Algorithm 


1. Routh's problem consists in determining the number $k$ of roots of a
real polynomial $f(z)$ in the right half-plane ($\operatorname{Re} z > 0$).
To begin with, we treat the case where $f(z)$ has no roots on the imaginary
axis. In the right half-plane we construct the semicircle of radius $R$ with
its center at the origin and we consider the domain bounded by this
semicircle and the segment of the imaginary axis (Fig. 7). For sufficiently
large $R$ all the zeros of $f(z)$ with positive real parts lie inside this domain.
Therefore $\arg f(z)$ increases by $2k\pi$ on going in the positive direction
along the contour of the domain.¹¹ On the other hand, the increase of
$\arg f(z)$ along the semicircle of radius $R$ for $R \to \infty$ is determined by the
increase of the argument of the highest term $a_0 z^n$ and is therefore $n\pi$.
Hence the increase of $\arg f(z)$ along the imaginary axis ($R \to \infty$) is given
by the expression

$$\Delta_{-\infty}^{+\infty} \arg f(i\omega) = (n - 2k)\pi. \tag{7}$$

[Fig. 7]


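We introduce a somewhat unusual notation for the coefficients of $f(z)$;
namely, we set

$$f(z) = a_0 z^n + b_0 z^{n-1} + a_1 z^{n-2} + b_1 z^{n-3} + \cdots \qquad (a_0 \neq 0).$$

Then

$$f(i\omega) = U(\omega) + iV(\omega), \tag{8}$$

where for even $n$

$$\begin{aligned}
U(\omega) &= (-1)^{\frac{n}{2}} \left( a_0 \omega^n - a_1 \omega^{n-2} + a_2 \omega^{n-4} - \cdots \right), \\
V(\omega) &= (-1)^{\frac{n}{2} - 1} \left( b_0 \omega^{n-1} - b_1 \omega^{n-3} + b_2 \omega^{n-5} - \cdots \right)
\end{aligned} \tag{8'}$$

and for odd $n$

$$\begin{aligned}
U(\omega) &= (-1)^{\frac{n-1}{2}} \left( b_0 \omega^{n-1} - b_1 \omega^{n-3} + b_2 \omega^{n-5} - \cdots \right), \\
V(\omega) &= (-1)^{\frac{n-1}{2}} \left( a_0 \omega^n - a_1 \omega^{n-2} + a_2 \omega^{n-4} - \cdots \right).
\end{aligned} \tag{8''}$$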
11 For if $f(z) = a_0 \prod_{i=1}^{n} (z - z_i)$, then $\Delta \arg f(z) = \sum_{i=1}^{n} \Delta \arg (z - z_i)$. If the point
$z_i$ lies inside the domain in question, then $\Delta \arg (z - z_i) = 2\pi$; if $z_i$ lies outside the
domain, then $\Delta \arg (z - z_i) = 0$.


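Following Routh, we make use of the Cauchy index. Then¹²

$$\frac{1}{\pi} \Delta_{-\infty}^{+\infty} \arg f(i\omega) =
\begin{cases}
\phantom{-} I_{-\infty}^{+\infty} \dfrac{U(\omega)}{V(\omega)} & \text{for } \lim\limits_{\omega \to \infty} \dfrac{U(\omega)}{V(\omega)} = 0, \\[2ex]
- I_{-\infty}^{+\infty} \dfrac{V(\omega)}{U(\omega)} & \text{for } \lim\limits_{\omega \to \infty} \dfrac{V(\omega)}{U(\omega)} = 0.
\end{cases} \tag{9}$$

The equations (8') and (8'') show that for even $n$ the lower formula in
(9) must be taken and for odd $n$, the upper. Then we easily obtain from
(7), (8'), (8''), and (9) that for every $n$ (even or odd)¹³

$$I_{-\infty}^{+\infty} \frac{b_0 \omega^{n-1} - b_1 \omega^{n-3} + \cdots}{a_0 \omega^n - a_1 \omega^{n-2} + \cdots} = n - 2k. \tag{10}$$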
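
A quick check of (10) on the two simplest polynomials may be helpful (an illustration added here, with $n = 1$). For $f(z) = z + 1$ we have $a_0 = 1$, $b_0 = 1$, $k = 0$, and for $f(z) = z - 1$ we have $a_0 = 1$, $b_0 = -1$, $k = 1$:

$$I_{-\infty}^{+\infty} \frac{1}{\omega} = +1 = n - 2k \quad (k = 0), \qquad
I_{-\infty}^{+\infty} \frac{-1}{\omega} = -1 = n - 2k \quad (k = 1).$$

In the first case the function jumps from $-\infty$ to $+\infty$ at $\omega = 0$, in the second from $+\infty$ to $-\infty$.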
2. In order to determine the index on the left-hand side of (10) we use
Sturm's theorem (see the preceding section). We set

$$f_1(\omega) = a_0 \omega^n - a_1 \omega^{n-2} + \cdots, \qquad f_2(\omega) = b_0 \omega^{n-1} - b_1 \omega^{n-3} + \cdots \tag{11}$$

and, following Routh, construct a generalized Sturm chain (see p. 176)

$$f_1(\omega),\ f_2(\omega),\ f_3(\omega),\ \ldots,\ f_m(\omega) \tag{12}$$

by the Euclidean algorithm.

First we consider the regular case: $m = n + 1$. In this case the degree
of each function in (12) is one less than that of the preceding, and the last
function $f_m(\omega)$ is of degree zero.¹⁴

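From Euclid's algorithm (see (6)) it follows that

$$f_3(\omega) = \frac{a_0}{b_0}\, \omega f_2(\omega) - f_1(\omega) = c_0 \omega^{n-2} - c_1 \omega^{n-4} + c_2 \omega^{n-6} - \cdots,$$

where

$$c_0 = a_1 - \frac{a_0}{b_0} b_1 = \frac{b_0 a_1 - a_0 b_1}{b_0}, \qquad c_1 = a_2 - \frac{a_0}{b_0} b_2 = \frac{b_0 a_2 - a_0 b_2}{b_0}, \quad \ldots \tag{13}$$

Similarly

$$f_4(\omega) = \frac{b_0}{c_0}\, \omega f_3(\omega) - f_2(\omega) = d_0 \omega^{n-3} - d_1 \omega^{n-5} + \cdots,$$

where

$$d_0 = b_1 - \frac{b_0}{c_0} c_1 = \frac{c_0 b_1 - b_0 c_1}{c_0}, \qquad d_1 = b_2 - \frac{b_0}{c_0} c_2 = \frac{c_0 b_2 - b_0 c_2}{c_0}, \quad \ldots \tag{13'}$$

The coefficients of the remaining polynomials $f_5(\omega), \ldots, f_{n+1}(\omega)$ are
similarly determined.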


12 Since $\arg f(i\omega) = \operatorname{arccot} \dfrac{U(\omega)}{V(\omega)} = \arctan \dfrac{V(\omega)}{U(\omega)}$.
13 We recall that the formula (10) was derived under the assumption that f(z) has no 
roots on the imaginary axis. 


14 In the regular case (12) is the ordinary (not generalized) Sturm chain. 




Each polynomial

$$f_1(\omega),\ f_2(\omega),\ \ldots,\ f_{n+1}(\omega) \tag{14}$$

is an even or an odd function and two adjacent polynomials always have
opposite parity.

We form the Routh scheme 


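$$\begin{matrix}
a_0, & a_1, & a_2, & \ldots, \\
b_0, & b_1, & b_2, & \ldots, \\
c_0, & c_1, & c_2, & \ldots, \\
d_0, & d_1, & d_2, & \ldots, \\
\cdot & \cdot & \cdot & \cdot
\end{matrix} \tag{15}$$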


The formulas (13), (13') show that every row in this scheme is
determined by the two preceding rows according to the following rule:


From the numbers of the upper row we subtract the corresponding num- 
bers of the lower row multiplied by the number that makes the first differ- 
ence zero. Omitting this zero difference, we obtain the required row. 

The regular case is obviously characterized by the fact that the repeated
application of this rule never yields a zero in the sequence

$$b_0,\ c_0,\ d_0,\ \ldots$$

Figs. 8 and 9 show the skeleton of Routh's scheme for an even $n$ ($n = 6$)
and an odd $n$ ($n = 7$). Here the elements of the scheme are indicated by
dots.

In the regular case, the polynomials $f_1(\omega)$ and $f_2(\omega)$ have the greatest
common divisor $f_{n+1}(\omega) = \text{const} \neq 0$. Therefore these polynomials, and
hence $U(\omega)$ and $V(\omega)$ (see (8'), (8''), and (11)), do not vanish
simultaneously; i.e., $f(i\omega) = U(\omega) + iV(\omega) \neq 0$ for real $\omega$. Therefore: In the
regular case the formula (10) holds.

When we apply Sturm's theorem in the interval $(-\infty, +\infty)$ to the
left-hand side of this formula and make use of (14), we obtain by (10):

$$V(-\infty) - V(+\infty) = n - 2k. \tag{16}$$

In our case¹⁵

$$V(+\infty) = V(a_0, b_0, c_0, d_0, \ldots)$$

and

$$V(-\infty) = V(a_0, -b_0, c_0, -d_0, \ldots).$$

Hence

$$V(-\infty) = n - V(+\infty). \tag{17}$$

From (16) and (17) we find:

$$k = V(a_0, b_0, c_0, d_0, \ldots). \tag{18}$$

[Fig. 8] [Fig. 9]

15 The sign of $f_k(\omega)$ for $\omega = +\infty$ coincides with the sign of the highest coefficient
and for $\omega = -\infty$ differs from it by the factor $(-1)^{n-k+1}$ $(k = 1, 2, \ldots, n+1)$.


Thus we have proved the following theorem : 


THEOREM 2 (Routh): The number of roots of the real polynomial $f(z)$
in the right half-plane $\operatorname{Re} z > 0$ is equal to the number of variations of sign
in the first column of Routh's scheme.
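
The rule for forming the rows and the statement of Theorem 2 can be sketched as follows. This is a minimal illustration for the regular case only, with hypothetical function names; the coefficients are assumed to be listed from the highest power down (so that they alternate $a_0, b_0, a_1, b_1, \ldots$), and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def routh_first_column(coeffs):
    """First column a0, b0, c0, d0, ... of Routh's scheme (regular case only)."""
    c = [Fraction(x) for x in coeffs]
    upper, lower = c[0::2], c[1::2]          # rows a_0, a_1, ... and b_0, b_1, ...
    column = [upper[0]]
    while lower:
        if lower[0] == 0:
            raise ValueError("singular case: zero in the first column")
        column.append(lower[0])
        factor = upper[0] / lower[0]         # makes the first difference zero
        padded = lower + [Fraction(0)] * (len(upper) - len(lower))
        # subtract the multiplied lower row from the upper row, omit the zero
        new = [upper[j] - factor * padded[j] for j in range(1, len(upper))]
        upper, lower = lower, new
    return column

def sign_variations(seq):
    """Number of variations of sign in a sequence of non-zero numbers."""
    signs = [x > 0 for x in seq if x != 0]
    return sum(1 for u, w in zip(signs, signs[1:]) if u != w)

def roots_in_right_half_plane(coeffs):
    """k = V(a0, b0, c0, d0, ...), formula (18)."""
    return sign_variations(routh_first_column(coeffs))

# (z - 1)(z - 2)(z + 4) = z^3 + z^2 - 10 z + 8 has two roots with Re z > 0
print(routh_first_column([1, 1, -10, 8]))
print(roots_in_right_half_plane([1, 1, -10, 8]))
```

For this polynomial the first column is $1, 1, -18, 8$, with two variations of sign, in agreement with the roots $1$ and $2$.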


3. We consider the important special case where all the roots of f(z) have 
negative real parts (‘case of stability’). If in this case we construct for the 
polynomials (11) the generalized Sturm chain (14), then, since k =0, the 
formula (16) can be written as follows: 


$$V(-\infty) - V(+\infty) = n. \tag{19}$$

But $0 \leq V(-\infty) \leq m - 1 \leq n$ and $0 \leq V(+\infty) \leq m - 1 \leq n$. Therefore
(19) is possible only when $m = n + 1$ (regular case!) and $V(+\infty) = 0$,
$V(-\infty) = m - 1 = n$. The formula (18) then implies:


ROUTH'S CRITERION. All the roots of the real polynomial $f(z)$ have
negative real parts if and only if in the carrying out of Routh's algorithm
all the elements of the first column of Routh's scheme are different from
zero and of like sign.
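
For instance (a worked case added here for illustration), for a cubic $f(z) = a_0 z^3 + b_0 z^2 + a_1 z + b_1$ with $b_0 \neq 0$ the formulas (13), (13') give the scheme

$$\begin{matrix}
a_0, & a_1, \\
b_0, & b_1, \\
c_0 = \dfrac{b_0 a_1 - a_0 b_1}{b_0}, & \\
d_0 = b_1, &
\end{matrix}$$

so that for $a_0 > 0$ Routh's criterion reduces to the three inequalities $b_0 > 0$, $b_0 a_1 - a_0 b_1 > 0$, $b_1 > 0$.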


4. In deriving Routh’s theorem we have made use of the formula (10). 
In what follows we shall have to generalize this formula. The formula (10) 
was deduced under the assumption that f(z) has no roots on the imaginary 
axis. We shall now show that in the general case, where the polynomial
$f(z) = a_0 z^n + b_0 z^{n-1} + a_1 z^{n-2} + \cdots$ $(a_0 \neq 0)$ has $k$ roots in the right
half-plane and $s$ roots on the imaginary axis, the formula (10) is replaced by

$$I_{-\infty}^{+\infty} \frac{b_0 \omega^{n-1} - b_1 \omega^{n-3} + b_2 \omega^{n-5} - \cdots}{a_0 \omega^n - a_1 \omega^{n-2} + a_2 \omega^{n-4} - \cdots} = n - 2k - s. \tag{20}$$

For

$$f(z) = d(z)\, f^*(z),$$

where the real polynomial $d(z) = z^s + \cdots$ has $s$ roots on the imaginary
axis and the polynomial $f^*(z)$ of degree $n^* = n - s$ has no such roots.




For the sake of definiteness, we consider the case where $s$ is even (the
case where $s$ is odd is analyzed similarly).
Let

$$f(i\omega) = U(\omega) + iV(\omega) = d(i\omega) \left[ U^*(\omega) + iV^*(\omega) \right].$$

Since in our case $d(i\omega)$ is a real polynomial in $\omega$, we have

$$\frac{U(\omega)}{V(\omega)} = \frac{U^*(\omega)}{V^*(\omega)}.$$

Since $n$ and $n^*$ have equal parity, we find by using (8'), (8''), and the
notation (11):

$$\frac{f_2(\omega)}{f_1(\omega)} = \frac{f_2^*(\omega)}{f_1^*(\omega)}.$$

We apply formula (10) to $f^*(z)$. Therefore

$$I_{-\infty}^{+\infty} \frac{f_2(\omega)}{f_1(\omega)} = I_{-\infty}^{+\infty} \frac{f_2^*(\omega)}{f_1^*(\omega)} = n^* - 2k = n - 2k - s,$$

and this is what we had to prove.


§ 4. The Singular Case. Examples 


1. In the preceding section we have examined the regular case where in
Routh's scheme none of the numbers $b_0, c_0, d_0, \ldots$ vanish.

We now proceed to deal with the singular cases, where among the
numbers $b_0, c_0, \ldots$ there occurs a zero, say, $h_0 = 0$. Routh's algorithm stops with
the row in which $h_0$ occurs, because to obtain the numbers of the following
row we would have to divide by $h_0$.

The singular cases can be of two types:

1) In the row in which $h_0$ occurs there are numbers different from zero.
This means that at some place of (12) the degree drops by more than one.

2) All the numbers of the row in which $h_0$ occurs vanish simultaneously.
Then this row is the $(m+1)$-th, where $m$ is the number of terms in the
generalized Sturm chain (12). In that case, the degrees of the functions in
(12) decrease by unity from one function to the next, but the degree of
the last function $f_m(\omega)$ is greater than zero. In both cases the number of
functions in (12) is $m < n + 1$.

Since the ordinary Routh's algorithm comes to an end in both cases,
Routh gives a special rule for continuing the scheme in the cases 1), 2).




2. In case 1), according to Routh, we have to substitute for $h_0 = 0$ a 'small'
value $\varepsilon$ of definite (but arbitrary) sign and continue to fill in the scheme.
Then the subsequent elements of the first column of the scheme are rational
functions of $\varepsilon$. The signs of these elements are determined by the
'smallness' and the sign of $\varepsilon$. If any one of these elements vanishes identically in $\varepsilon$,
then we replace this element by another small value $\eta$ and continue the
algorithm.
Example:

$$f(z) = z^4 + z^3 + 2z^2 + 2z + 1.$$

Routh's scheme (with a small parameter $\varepsilon$):

$$\begin{matrix}
1, & 2, & 1 \\
1, & 2 & \\
\varepsilon, & 1 & \\
2 - \dfrac{1}{\varepsilon} & & \\
1 & &
\end{matrix}
\qquad\qquad k = V\!\left(1, 1, \varepsilon, 2 - \frac{1}{\varepsilon}, 1\right) = 2.$$
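
A rough numerical counterpart of Routh's first rule is sketched below. It is only an approximation of the rule, under stated assumptions: the symbolic parameter $\varepsilon$ is replaced by a concrete small rational number (the default `Fraction(1, 10**6)` is an arbitrary choice made here), which is adequate only when that value is small enough for the signs to have settled; the function names are hypothetical.

```python
from fractions import Fraction

def routh_column_with_epsilon(coeffs, eps=Fraction(1, 10**6)):
    """First column of Routh's scheme; a leading zero in a non-zero row
    is replaced by the small number eps (singularity of the first type)."""
    c = [Fraction(x) for x in coeffs]
    upper, lower = c[0::2], c[1::2]
    column = [upper[0]]
    while lower:
        if lower[0] == 0:
            if not any(lower):
                raise ValueError("zero row: singularity of the second type")
            lower = [Fraction(eps)] + lower[1:]   # substitute eps for h_0 = 0
        column.append(lower[0])
        factor = upper[0] / lower[0]
        padded = lower + [Fraction(0)] * (len(upper) - len(lower))
        new = [upper[j] - factor * padded[j] for j in range(1, len(upper))]
        upper, lower = lower, new
    return column

def sign_variations(seq):
    signs = [x > 0 for x in seq if x != 0]
    return sum(1 for u, w in zip(signs, signs[1:]) if u != w)

# The example of the text: f(z) = z^4 + z^3 + 2 z^2 + 2 z + 1
f = [1, 1, 2, 2, 1]
for eps in (Fraction(1, 10**6), Fraction(-1, 10**6)):
    col = routh_column_with_epsilon(f, eps)
    print([float(x) for x in col], sign_variations(col))
```

Both signs of the substituted value give two variations of sign, in line with $k = 2$ in the example; the argument that follows explains why the sign of $\varepsilon$ cannot matter when $f(z)$ has no roots on the imaginary axis.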


This special method of varying the elements of the scheme is based on 
the following observation : 


Since we assume that there is no singularity of the second type, the
functions $f_1(\omega)$ and $f_2(\omega)$ are relatively prime. Hence it follows that the
polynomial $f(z)$ has no roots on the imaginary axis.

In Routh's scheme all the elements are expressed rationally in terms of
the elements of the first two rows, i.e., the coefficients of the given
polynomial. But it is not difficult to observe in the formulas (13), (13') and
the analogous formulas for the subsequent rows that, once we have given
arbitrary values to the elements of any two adjacent rows of Routh's scheme
and to the first element of the preceding row, we can express all the elements
in the first two rows, i.e., the coefficients of the original polynomial, in
integral rational form in terms of these elements. Thus, for example, all
the numbers $a$, $b$ can be represented as integral rational functions of

$$a_0,\ b_0,\ c_0,\ \ldots,\ g_0,\ g_1,\ g_2,\ \ldots,\ h_0,\ h_1,\ h_2,\ \ldots$$

Therefore, in replacing $g_0 = 0$ by $\varepsilon$ we in fact modify our original
polynomial. Instead of the scheme for $f(z)$ we have the Routh scheme for a
polynomial $F(z, \varepsilon)$, where $F(z, \varepsilon)$ is an integral rational function of $z$ and $\varepsilon$
which reduces to $f(z)$ for $\varepsilon = 0$. Since the roots of $F(z, \varepsilon)$ change
continuously with a change of the parameter $\varepsilon$ and since there are no roots on the
imaginary axis for $\varepsilon = 0$, the number $k$ of roots in the right half-plane is the
same for $F(z, \varepsilon)$ and $F(z, 0) = f(z)$ for values of $\varepsilon$ of small modulus.




3. Let us now proceed to a singularity of the second type. Suppose that
in Routh's scheme

$$a_0 \neq 0,\ b_0 \neq 0,\ \ldots,\ e_0 \neq 0, \qquad g_0 = 0,\ g_1 = 0,\ g_2 = 0,\ \ldots$$

In this case, the last polynomial in the generalized Sturm chain (12) is of
the form:

$$f_m(\omega) = e_0 \omega^{n-m+1} - e_1 \omega^{n-m-1} + \cdots.$$

Routh proposes to replace $f_{m+1}(\omega)$, which is zero, by $f_m'(\omega)$; i.e., he
proposes to write instead of $g_0, g_1, \ldots$ the corresponding coefficients

$$(n - m + 1)\, e_0, \quad (n - m - 1)\, e_1, \quad \ldots$$

and to continue the algorithm.
The logical basis for this rule is as follows:

By formula (20)

$$I_{-\infty}^{+\infty} \frac{f_2(\omega)}{f_1(\omega)} = n - 2k - s$$

(the $s$ roots of $f(z)$ on the imaginary axis coincide with the real roots of
$f_m(\omega)$). Therefore, if these real roots are simple, then (see p. 174)

$$I_{-\infty}^{+\infty} \frac{f_m'(\omega)}{f_m(\omega)} = s$$

and therefore

$$I_{-\infty}^{+\infty} \frac{f_2(\omega)}{f_1(\omega)} + I_{-\infty}^{+\infty} \frac{f_m'(\omega)}{f_m(\omega)} = n - 2k.$$

This formula shows that the missing part of Routh's scheme must be filled
by the Routh scheme for the polynomials $f_m(\omega)$ and $f_m'(\omega)$. The
coefficients of $f_m'(\omega)$ are used to replace the elements of the zero row in Routh's
scheme.

But if the roots of $f_m(\omega)$ are not simple, then we denote by $d(\omega)$ the
greatest common divisor of $f_m(\omega)$ and $f_m'(\omega)$, by $e(\omega)$ the greatest common
divisor of $d(\omega)$ and $d'(\omega)$, etc., and we have:

$$I_{-\infty}^{+\infty} \frac{f_m'(\omega)}{f_m(\omega)} + I_{-\infty}^{+\infty} \frac{d'(\omega)}{d(\omega)} + I_{-\infty}^{+\infty} \frac{e'(\omega)}{e(\omega)} + \cdots = s.$$

Thus the required number $k$ can be found if the missing part of Routh's
scheme is filled by the Routh scheme for $f_m(\omega)$ and $f_m'(\omega)$, then the scheme
for $d(\omega)$ and $d'(\omega)$, then that for $e(\omega)$ and $e'(\omega)$, etc.; i.e., Routh's rule
has to be applied several times to dispose of a singularity of the second type.
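
A very small instance of this rule (an illustration added here, not from the text) is $f(z) = z^3 + z^2 + z + 1 = (z + 1)(z^2 + 1)$, for which $n = 3$, $a_0 = a_1 = 1$, $b_0 = b_1 = 1$. Here $f_1(\omega) = \omega^3 - \omega$, $f_2(\omega) = \omega^2 - 1$, and

$$f_3(\omega) = \frac{a_0}{b_0}\, \omega f_2(\omega) - f_1(\omega) \equiv 0,$$

so that $m = 2$ and the third row of the scheme is a zero row. Replacing it by the coefficients of $f_2'(\omega) = 2\omega$ and continuing, we obtain the first column

$$1, \quad 1, \quad 2, \quad 1,$$

which has no variations of sign; hence $k = 0$. This agrees with the roots $-1$, $\pm i$ of $f(z)$, and with (20): $I_{-\infty}^{+\infty} \dfrac{\omega^2 - 1}{\omega^3 - \omega} = I_{-\infty}^{+\infty} \dfrac{1}{\omega} = 1 = n - 2k - s$ with $s = 2$.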




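Example. $f(z) = z^{10} + z^9 - z^8 - 2z^7 + z^6 + 3z^5 + z^4 - 2z^3 - z^2 + z + 1$.

Scheme (where two rows are given for one power of $\omega$, the second is the first divided by a positive number)

    ω^10 |  1   -1    1    1   -1    1
    ω^9  |  1   -2    3   -2    1
    ω^8  |  1   -2    3   -2    1
    ω^7  |  8  -12   12   -4
         |  2   -3    3   -1
    ω^6  | -1    3   -3    2
    ω^5  |  3   -3    3
         |  1   -1    1
    ω^4  |  2   -2    2
         |  1   -1    1
    ω^3  |  4   -2
         |  2   -1
    ω^2  | -1    2
    ω^1  |  3
         |  1
    ω^0  |  2
         |  1

$$k = V(1, 1, 1, 2, -1, 1, 1, 2, -1, 1, 1) = 4.$$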


Note. All the elements of any one row may be multiplied by one and 
the same number without changing the signs of the elements of the first 
column. This remark has been used in constructing the scheme. 


4. However, the application of both rules of Routh does not enable us to
determine the number $k$ in all the cases. The application of the first rule
(introduction of small parameters $\varepsilon, \eta, \ldots$) is justified only when $f(z)$ has
no roots on the imaginary axis.

If $f(z)$ has roots on the imaginary axis, then by varying the parameter $\varepsilon$
some of these roots may pass over into the right half-plane and change $k$.


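Example. $f(z) = z^6 + z^5 + 3z^4 + 3z^3 + 3z^2 + 2z + 1$.

Scheme

    ω^6 |  1    3    3    1
    ω^5 |  1    3    2
    ω^4 |  ε    1    1
    ω^3 |  3 − 1/ε    2 − 1/ε
    ω^2 |  1 − (2ε − 1)/(3 − 1/ε) ≈ 1    1
    ω^1 |  ≈ −ε
    ω^0 |  1

$$V\!\left(1, 1, \varepsilon, 3 - \frac{1}{\varepsilon}, 1, -\varepsilon, 1\right) =
\begin{cases} 4 & \text{for } \varepsilon > 0, \\ 2 & \text{for } \varepsilon < 0. \end{cases}$$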




The question of the value of k remains open. 
In the general case, where f(z) has roots on the imaginary axis, we have 
to proceed as follows: 


Setting $f(z) = F_1(z) + F_2(z)$, where

$$F_1(z) = a_0 z^n + a_1 z^{n-2} + \cdots, \qquad F_2(z) = b_0 z^{n-1} + b_1 z^{n-3} + \cdots,$$

we must find the greatest common divisor $d(z)$ of $F_1(z)$ and $F_2(z)$. Then
$f(z) = d(z)\, f^*(z)$.

If $f(z)$ has a root $z$ for which $-z$ is also a root (all the roots on the
imaginary axis have this property), then it follows from $f(z) = 0$ and
$f(-z) = 0$ that $F_1(z) = 0$ and $F_2(z) = 0$, i.e., $z$ is a root of $d(z)$. Therefore
$f^*(z)$ has no roots $z$ for which $-z$ is also a root of $f^*(z)$.

Then

$$k = k_1 + k_2,$$

where $k_1$ and $k_2$ are the respective numbers of roots of $f^*(z)$ and $d(z)$ in the
right half-plane; $k_1$ is determined by Routh's algorithm and $k_2 = (q - s)/2$,
where $q$ is the degree of $d(z)$ and $s$ the number of real roots of $d(i\omega)$.¹⁶

In the last example,

$$d(z) = z^2 + 1, \qquad f^*(z) = z^4 + z^3 + 2z^2 + 2z + 1.$$

Therefore (see example on p. 182), we have $k_2 = 0$, $k_1 = 2$, and hence

$$k = 2.$$

§ 5. Lyapunov’s Theorem 


1. From the investigations of A. M. Lyapunov published in 1892 in his
monograph 'The General Problem of Stability of Motion' there follows a
theorem¹⁷ that gives necessary and sufficient conditions for all the roots of
the characteristic equation $|\lambda E - A| = 0$ of a real matrix $A = \|a_{ik}\|_1^n$ to
have negative real parts. Since every polynomial

$$f(\lambda) = a_0 \lambda^n + a_1 \lambda^{n-1} + \cdots + a_n \qquad (a_0 \neq 0)$$


16 $d(i\omega)$ is a real polynomial or becomes one after cancelling $i$. The number of its real
roots can be determined by Sturm's theorem.

17 See [32], § 20.




can be represented as a characteristic determinant $|\lambda E - A|$,¹⁸ Lyapunov's
theorem is of general character and is applicable to an arbitrary algebraic
equation $f(\lambda) = 0$.

Suppose given a real matrix $A = \|a_{ik}\|_1^n$ and a homogeneous polynomial
of dimension $m$ in the variables $x_1, x_2, \ldots, x_n$:

$$V(\underbrace{x, x, \ldots, x}_{m}) \qquad (x = (x_1, x_2, \ldots, x_n)).$$

Let us find the total derivative with respect to $t$ of $V(x, x, \ldots, x)$ under
the assumption that $x$ is a solution of the differential system

$$\frac{dx}{dt} = Ax.$$

Then

$$\frac{d}{dt} V(x, x, \ldots, x) = V(Ax, x, \ldots, x) + V(x, Ax, \ldots, x) + \cdots + V(x, x, \ldots, Ax) = W(x, x, \ldots, x), \tag{21}$$

where $W(x, x, \ldots, x)$ is again a homogeneous polynomial of dimension $m$
in $x_1, x_2, \ldots, x_n$. The equation (21) defines a linear operator $\hat{A}$ which
associates with every homogeneous polynomial of dimension $m$, $V(x, x, \ldots, x)$,
a certain homogeneous polynomial $W(x, x, \ldots, x)$ of the same dimension $m$:

$$W = \hat{A}(V).$$

We restrict ourselves to the case $m = 2$.¹⁹ Then $V(x, x)$ and $W(x, x)$
are quadratic forms in the variables $x_1, x_2, \ldots, x_n$ connected by the equation

$$\frac{d}{dt} V(x, x) = V(Ax, x) + V(x, Ax) = W(x, x); \tag{22}$$

hence²⁰

$$W = \hat{A}(V) = A'V + VA. \tag{23}$$

18 For this purpose it is sufficient to set, for example:

$$A = \begin{pmatrix}
0 & 0 & \cdots & 0 & -\dfrac{a_n}{a_0} \\
1 & 0 & \cdots & 0 & -\dfrac{a_{n-1}}{a_0} \\
\vdots & \vdots & & \vdots & \vdots \\
0 & 0 & \cdots & 1 & -\dfrac{a_1}{a_0}
\end{pmatrix}.$$

19 A. M. Lyapunov has proved his theorem for every positive integer $m$.
20 Because $V(x, y) = x'Vy$.



Here $V = \|v_{ik}\|_1^n$ and $W = \|w_{ik}\|_1^n$ are symmetric matrices formed,
respectively, from the coefficients of the forms $V(x, x)$ and $W(x, x)$. The
linear operator $\hat{A}$ in the space of matrices of order $n$ is completely
determined by specification of the matrix $A = \|a_{ik}\|_1^n$.

If $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the characteristic values of the matrix $A$, then every
characteristic value of the operator $\hat{A}$ can be represented in the form
$\lambda_i + \lambda_k$ $(1 \leq i, k \leq n)$.²¹
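
One way to see where the sums $\lambda_i + \lambda_k$ come from (a sketch added here, under the simplifying assumption that $A'$ has $n$ linearly independent characteristic columns $u_1, \ldots, u_n$ with $A'u_i = \lambda_i u_i$) is to apply the operator (23) to the matrices $u_i u_k'$:

$$\hat{A}(u_i u_k') = A' u_i u_k' + u_i u_k' A = \lambda_i u_i u_k' + u_i (A' u_k)' = (\lambda_i + \lambda_k)\, u_i u_k'.$$

Under this assumption the $n^2$ matrices $u_i u_k'$ form a basis of the space of matrices of order $n$, so that the numbers $\lambda_i + \lambda_k$ exhaust the characteristic values of $\hat{A}$.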

Therefore, if the matrix $A = \|a_{ik}\|_1^n$ has no zero characteristic value and
no two that are opposites, then the operator $\hat{A}$ is non-singular. In this case
the matrix $W$ in (23) determines the matrix $V$ uniquely.

If $V$ is symmetric, then the matrix $W$ defined by (23) is also symmetric.
If $\hat{A}$ is a non-singular operator, then the converse statement also holds:
Every symmetric matrix $W$ corresponds by (23) to a symmetric matrix $V$.
For in this case we find, by going over to the transposed matrices on both
sides of (23), that the matrix $V'$, as well as $V$, satisfies (23). By the
uniqueness of the solution, $V' = V$.

Thus: If the matrix $A = \|a_{ik}\|_1^n$ has no zero and no two opposite
characteristic values, then every quadratic form $W(x, x)$ corresponds to one and
only one quadratic form $V(x, x)$ connected with $W(x, x)$ by (22).

Now we can formulate Lyapunov’s theorem. 


THEOREM 3 (Lyapunov): If all the characteristic values of the real
matrix $A = \|a_{ik}\|_1^n$ have negative real parts, then to every negative-definite
quadratic form $W(x, x)$ there corresponds a positive-definite quadratic form
$V(x, x)$ connected with $W(x, x)$, taking

$$\frac{dx}{dt} = Ax \tag{24}$$

into account, by the equation

$$\frac{d}{dt} V(x, x) = W(x, x). \tag{25}$$

Conversely, if for every negative-definite form $W(x, x)$ there exists a
positive-definite form $V(x, x)$ connected with $W(x, x)$ by the equation (25),
taking (24) into account, then all the characteristic values of the matrix
$A = \|a_{ik}\|_1^n$ have negative real parts.
Proof. 1. Suppose that all the characteristic values of $A$ have negative
real parts. Then for every solution $x = e^{At} x_0$ of (24) we have $\lim\limits_{t \to +\infty} x = 0$.²²
Suppose that the forms $V(x, x)$ and $W(x, x)$ are connected by (25) and that


21 See footnote 18. 
22 See Vol. I, Chapter V, § 6. 




$$W(x, x) < 0 \qquad (x \neq 0).^{23}$$

Let us assume that for some $x_0 \neq 0$

$$V_0 = V(x_0, x_0) \leq 0.$$

But $\dfrac{d}{dt} V(x, x) = W(x, x) < 0$ $(x = e^{At} x_0)$. Therefore for $t > 0$ the value
of $V(x, x)$ is negative and decreases for $t \to \infty$, which results in a
contradiction to the equation $\lim\limits_{t \to +\infty} V(x, x) = \lim\limits_{x \to 0} V(x, x) = 0$. Therefore $V(x, x) > 0$
for $x \neq 0$, i.e., $V(x, x)$ is a positive-definite quadratic form.

2. Suppose, conversely, that in (25)

$$W(x, x) < 0, \qquad V(x, x) > 0 \qquad (x \neq 0).$$

From (25) it follows that

$$V(x, x) = V(x_0, x_0) + \int_0^t W(x, x)\, dt \qquad (x = e^{At} x_0). \tag{25'}$$

We shall show that for every $x_0 \neq 0$ the column $x = e^{At} x_0$ comes arbitrarily
near to zero for arbitrarily large values of $t > 0$. Assume the contrary.
Then there exists a number $\nu > 0$ such that

$$W(x, x) < -\nu < 0 \qquad (x = e^{At} x_0,\ x_0 \neq 0,\ t > 0).$$

But then from (25')

$$V(x, x) < V(x_0, x_0) - \nu t,$$

and so for sufficiently large values of $t$ we have $V(x, x) < 0$, which
contradicts our assumption.
From what we have shown, it follows that for certain sufficiently large
values of $t$ the value of $V(x, x)$ $(x = e^{At} x_0,\ x_0 \neq 0)$ will be arbitrarily near
to zero. But $V(x, x)$ decreases monotonically for $t > 0$, since $\dfrac{d}{dt} V(x, x) =
W(x, x) < 0$. Therefore $\lim\limits_{t \to +\infty} V(x, x) = 0$.

Hence it follows that for every $x_0$, $\lim\limits_{t \to +\infty} e^{At} x_0 = 0$, i.e., $\lim\limits_{t \to +\infty} e^{At} = 0$.
This is only possible if all the characteristic values of $A$ have negative real
parts (see Vol. I, Chapter V, § 6).


The theorem is now completely proved. 
For the form $W(x, x)$ in Lyapunov's theorem we can take any
negative-definite form, in particular, the form $-\sum\limits_{i=1}^{n} x_i^2$. In this case the theorem
admits of the following matrix formulation:

23 The form $W(x, x)$ is given arbitrarily. The form $V(x, x)$ is uniquely determined
by (25), because $A$ has in this case neither the characteristic value zero nor pairs of
opposite characteristic values.




THEOREM 3': All the characteristic values of the real matrix $A = \|a_{ik}\|_1^n$
have negative real parts if and only if the matrix equation

$$A'V + VA = -E \tag{26}$$

has as its solution $V$ the coefficient matrix of some positive-definite
quadratic form $V(x, x) > 0$.
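
Theorem 3' can be tried out numerically on small matrices. The following sketch is an illustration under stated assumptions, not a method from the text: the function names are hypothetical, the equation (26) is written out as $n^2$ linear equations for the elements of $V$ and solved by exact Gauss-Jordan elimination, and positive definiteness is tested by the signs of the successive principal minors. It assumes that $A$ has no zero and no two opposite characteristic values, so that the linear system is non-singular.

```python
from fractions import Fraction

def solve_linear(M, rhs):
    """Gauss-Jordan elimination over exact fractions; M must be non-singular."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                aug[r] = [x - aug[r][col] * y for x, y in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def lyapunov_solve(A):
    """Solve A'V + VA = -E; the unknown V[i][j] has index i*n + j."""
    n = len(A)
    M = [[Fraction(0)] * (n * n) for _ in range(n * n)]
    rhs = []
    for i in range(n):
        for j in range(n):
            row = M[i * n + j]
            for k in range(n):
                row[k * n + j] += A[k][i]     # (A'V)_ij = sum_k A[k][i] V[k][j]
                row[i * n + k] += A[k][j]     # (VA)_ij  = sum_k V[i][k] A[k][j]
            rhs.append(-1 if i == j else 0)
    v = solve_linear(M, rhs)
    return [v[i * n:(i + 1) * n] for i in range(n)]

def det(M):
    """Determinant by expansion along the first row (small matrices only)."""
    if not M:
        return Fraction(1)
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def is_positive_definite(V):
    """All successive principal minors positive."""
    return all(det([row[:p] for row in V[:p]]) > 0 for p in range(1, len(V) + 1))

A = [[-1, 2], [0, -3]]            # characteristic values -1 and -3
V = lyapunov_solve(A)
print(V, is_positive_definite(V))

B = [[1, 0], [0, -2]]             # one characteristic value in the right half-plane
print(is_positive_definite(lyapunov_solve(B)))
```

For the first matrix the solution is $V = \begin{pmatrix} 1/2 & 1/4 \\ 1/4 & 1/3 \end{pmatrix}$, which is positive definite; for the second the element $v_{11} = -1/2$ already violates the conditions.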


2. From this theorem we derive a criterion for determining the stability of
a non-linear system from its linear approximation.²⁴

Suppose that it is required to prove the asymptotic stability of the zero
solution of the non-linear system of differential equations (1) (p. 172) in
the case where the coefficients $a_{ik}$ $(i, k = 1, 2, \ldots, n)$ in the linear terms
on the right-hand side form a matrix $A = \|a_{ik}\|_1^n$ having only characteristic
values with negative real parts. Then, if we determine a positive-definite
form $V(x, x)$ by the matrix equation (26) and calculate its total derivative
with respect to time under the assumption that $x = (x_1, x_2, \ldots, x_n)$ is a
solution of the given system (1), we have:

$$\frac{d}{dt} V(x, x) = -\sum_{i=1}^{n} x_i^2 + R(x_1, x_2, \ldots, x_n),$$

where $R(x_1, x_2, \ldots, x_n)$ is a series containing terms of the third and higher
total degree in $x_1, x_2, \ldots, x_n$. Therefore, in some sufficiently small
neighborhood of $(0, 0, \ldots, 0)$ we have simultaneously for every $x \neq 0$

$$V(x, x) > 0, \qquad \frac{d}{dt} V(x, x) < 0.$$

By Lyapunov's general criterion of stability²⁵ this also indicates the
asymptotic stability of the zero solution of the system of differential
equations.

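If we express the elements of $V$ from the matrix equation (26) in terms
of the elements of $A$ and substitute these expressions in the inequalities

$$v_{11} > 0, \qquad \begin{vmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{vmatrix} > 0, \qquad \ldots, \qquad
\begin{vmatrix} v_{11} & v_{12} & \cdots & v_{1n} \\ v_{21} & v_{22} & \cdots & v_{2n} \\ \cdot & \cdot & \cdots & \cdot \\ v_{n1} & v_{n2} & \cdots & v_{nn} \end{vmatrix} > 0,$$

then we obtain the inequalities that the elements of a matrix $A = \|a_{ik}\|_1^n$
must satisfy in order that all the characteristic values of the matrix should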


24 See [32], § 26; [9], pp. 113 ff.; [36], pp. 66 ff. 
25 See [32], § 16; [9], pp. 19-21 and 31-33; [36], pp. 32-34. 




have negative real parts. However, these inequalities can be obtained in a 
considerably simpler form from the criterion of Routh-Hurwitz, which will 
be discussed in the following section. 

Note. Lyapunov's theorem (3) or (3') can be generalized immediately
to the case of an arbitrary complex matrix $A = \|a_{ik}\|_1^n$. The quadratic
forms $V(x, x)$ and $W(x, x)$ are then replaced by Hermitian forms

$$V(x, x) = \sum_{i,k=1}^{n} v_{ik}\, \bar{x}_i x_k, \qquad W(x, x) = \sum_{i,k=1}^{n} w_{ik}\, \bar{x}_i x_k.$$

Correspondingly, the matrix equation (26) is replaced by the equation

$$A^* V + VA = -E \qquad (A^* = \bar{A}').$$


§ 6. The Theorem of Routh-Hurwitz 


1. In the preceding sections we have explained the method of Routh,
unsurpassed in its simplicity, of determining the number $k$ of roots in the right
half-plane of a real polynomial whose coefficients are given as explicit
numbers. If the coefficients of the polynomial depend on parameters and
it is required to determine for what values of the parameters the number $k$
has one value or another, in particular the value 0 ('domain of stability'),²⁶
then it is desirable to have explicit expressions for the values of $c_0, d_0, \ldots$
in terms of the coefficients of the given polynomial. In solving this problem,
we obtain a method of determining $k$ and, in particular, a stability criterion
in a form in which it was established by Hurwitz [204].
We again consider the polynomial

$$f(z) = a_0 z^n + b_0 z^{n-1} + a_1 z^{n-2} + b_1 z^{n-3} + \cdots \qquad (a_0 \neq 0).$$

By the Hurwitz matrix we mean the square matrix of order $n$

$$H = \begin{pmatrix}
b_0 & b_1 & b_2 & \cdots & b_{n-1} \\
a_0 & a_1 & a_2 & \cdots & a_{n-1} \\
0 & b_0 & b_1 & \cdots & b_{n-2} \\
0 & a_0 & a_1 & \cdots & a_{n-2} \\
0 & 0 & b_0 & \cdots & b_{n-3} \\
\cdot & \cdot & \cdot & \cdots & \cdot
\end{pmatrix}
\qquad \left( a_k = 0 \text{ for } k > \left[\tfrac{n}{2}\right], \quad b_k = 0 \text{ for } k > \left[\tfrac{n-1}{2}\right] \right). \tag{27}$$

26 For this is precisely the situation in planning new mechanical or electrical systems
of governors.



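We transform the matrix by subtracting from the second, fourth, ...
rows the first, third, ... row, multiplied by $a_0/b_0$.²⁷ We obtain the matrix

$$\begin{pmatrix}
b_0 & b_1 & b_2 & \cdots & b_{n-1} \\
0 & c_0 & c_1 & \cdots & c_{n-2} \\
0 & b_0 & b_1 & \cdots & b_{n-2} \\
0 & 0 & c_0 & \cdots & c_{n-3} \\
0 & 0 & b_0 & \cdots & b_{n-3} \\
\cdot & \cdot & \cdot & \cdots & \cdot
\end{pmatrix}.$$

In this matrix $c_0, c_1, \ldots$ is the third row of Routh's scheme supplemented by
zeros ($c_k = 0$ for $k > [n/2] - 1$).

We transform this matrix again by subtracting from the third, fifth, ...
rows the second, fourth, ... row, multiplied by $b_0/c_0$:

$$\begin{pmatrix}
b_0 & b_1 & b_2 & b_3 & \cdots & b_{n-1} \\
0 & c_0 & c_1 & c_2 & \cdots & c_{n-2} \\
0 & 0 & d_0 & d_1 & \cdots & d_{n-3} \\
0 & 0 & c_0 & c_1 & \cdots & c_{n-3} \\
0 & 0 & 0 & d_0 & \cdots & d_{n-4} \\
\cdot & \cdot & \cdot & \cdot & \cdots & \cdot
\end{pmatrix}.$$

Continuing this process, we ultimately arrive at a triangular matrix of
order $n$

$$R = \begin{pmatrix}
b_0 & b_1 & b_2 & \cdots \\
0 & c_0 & c_1 & \cdots \\
0 & 0 & d_0 & \cdots \\
\cdot & \cdot & \cdot & \cdots
\end{pmatrix}, \tag{28}$$

which we call the Routh matrix. It is obtained from Routh's scheme (see
(15)) by: 1) deleting the first row; 2) shifting the rows to the right so that
their first elements come to lie on the main diagonal; and 3) completing it
by zeros to a square matrix of order $n$.

27 We begin by dealing with the regular case where $b_0 \neq 0$, $c_0 \neq 0$, $d_0 \neq 0$, ....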




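DEFINITION 2: Two matrices $A = \|a_{ik}\|_1^n$ and $B = \|b_{ik}\|_1^n$ will be called
equivalent if and only if for every $p \leq n$ the corresponding minors of order
$p$ in the first $p$ rows are equal:

$$A\begin{pmatrix} 1 & 2 & \ldots & p \\ i_1 & i_2 & \ldots & i_p \end{pmatrix} =
B\begin{pmatrix} 1 & 2 & \ldots & p \\ i_1 & i_2 & \ldots & i_p \end{pmatrix}
\qquad \begin{pmatrix} i_1, i_2, \ldots, i_p = 1, 2, \ldots, n \\ p = 1, 2, \ldots, n \end{pmatrix}.$$

Since we do not change the values of the minors of order $p$ in the first
$p$ rows when we subtract from any row of the matrix an arbitrary multiple
of any preceding row, the Hurwitz and Routh matrices $H$ and $R$ are
equivalent in the sense of Definition 2:

$$H\begin{pmatrix} 1 & 2 & \ldots & p \\ i_1 & i_2 & \ldots & i_p \end{pmatrix} =
R\begin{pmatrix} 1 & 2 & \ldots & p \\ i_1 & i_2 & \ldots & i_p \end{pmatrix}
\qquad \begin{pmatrix} i_1, i_2, \ldots, i_p = 1, 2, \ldots, n \\ p = 1, 2, \ldots, n \end{pmatrix}. \tag{29}$$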


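The equivalence of the matrices $H$ and $R$ enables us to express all the
elements of $R$, i.e. of the Routh scheme, in terms of the minors of the Hurwitz
matrix $H$ and, therefore, in terms of the coefficients of the given polynomial.
For when we give to $p$ in (29) the values 1, 2, 3, ... in succession, we obtain

$$\begin{aligned}
&H\begin{pmatrix} 1 \\ 1 \end{pmatrix} = b_0, \quad H\begin{pmatrix} 1 \\ 2 \end{pmatrix} = b_1, \quad H\begin{pmatrix} 1 \\ 3 \end{pmatrix} = b_2, \ \ldots, \\
&H\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix} = b_0 c_0, \quad H\begin{pmatrix} 1 & 2 \\ 1 & 3 \end{pmatrix} = b_0 c_1, \quad H\begin{pmatrix} 1 & 2 \\ 1 & 4 \end{pmatrix} = b_0 c_2, \ \ldots, \\
&H\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix} = b_0 c_0 d_0, \quad H\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 4 \end{pmatrix} = b_0 c_0 d_1, \quad H\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 5 \end{pmatrix} = b_0 c_0 d_2, \ \ldots,
\end{aligned} \tag{30}$$

etc.
Hence we find the following expressions for the elements of Routh's
scheme:

$$\begin{aligned}
&b_0 = H\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad b_1 = H\begin{pmatrix} 1 \\ 2 \end{pmatrix}, \quad b_2 = H\begin{pmatrix} 1 \\ 3 \end{pmatrix}, \ \ldots, \\
&c_0 = \frac{H\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}}{H\begin{pmatrix} 1 \\ 1 \end{pmatrix}}, \quad
c_1 = \frac{H\begin{pmatrix} 1 & 2 \\ 1 & 3 \end{pmatrix}}{H\begin{pmatrix} 1 \\ 1 \end{pmatrix}}, \quad
c_2 = \frac{H\begin{pmatrix} 1 & 2 \\ 1 & 4 \end{pmatrix}}{H\begin{pmatrix} 1 \\ 1 \end{pmatrix}}, \ \ldots, \\
&d_0 = \frac{H\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}}{H\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}}, \quad
d_1 = \frac{H\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 4 \end{pmatrix}}{H\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}}, \ \ldots
\end{aligned} \tag{31}$$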




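The successive principal minors of $H$ are usually called the Hurwitz
determinants. We shall denote them by

$$\Delta_1 = H\begin{pmatrix} 1 \\ 1 \end{pmatrix} = b_0, \qquad
\Delta_2 = H\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix} = \begin{vmatrix} b_0 & b_1 \\ a_0 & a_1 \end{vmatrix}, \qquad \ldots,$$

$$\Delta_n = H\begin{pmatrix} 1 & 2 & \ldots & n \\ 1 & 2 & \ldots & n \end{pmatrix} =
\begin{vmatrix}
b_0 & b_1 & \cdots & b_{n-1} \\
a_0 & a_1 & \cdots & a_{n-1} \\
0 & b_0 & \cdots & b_{n-2} \\
0 & a_0 & \cdots & a_{n-2} \\
\cdot & \cdot & \cdots & \cdot
\end{vmatrix}. \tag{32}$$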


Note 1. By the formulas (30),²⁸

$$\Delta_1 = b_0, \qquad \Delta_2 = b_0 c_0, \qquad \Delta_3 = b_0 c_0 d_0, \qquad \ldots \tag{33}$$

From $\Delta_1 \neq 0, \ldots, \Delta_p \neq 0$ it follows that the first $p$ of the numbers
$b_0, c_0, \ldots$ are different from zero, and vice versa; in this case the $p$ successive
rows of Routh's scheme beginning with the third are completely determined
and the formulas (31) hold for them.

Note 2. The regular case (all the $b_0, c_0, \ldots$ have a meaning and are
different from zero) is characterized by the inequalities

$$\Delta_1 \neq 0, \quad \Delta_2 \neq 0, \quad \ldots, \quad \Delta_n \neq 0.$$

Note 3. The definition of the elements of Routh's scheme by means of
the formulas (31) is more general than that by means of Routh's algorithm.
Thus, for example, if $b_0 = H\begin{pmatrix} 1 \\ 1 \end{pmatrix} = 0$, then Routh's algorithm does not give us
anything except the first two rows formed from the coefficients of the given
polynomial. However if for $\Delta_1 = 0$ the remaining determinants $\Delta_2, \Delta_3, \ldots$
are different from zero, then by omitting the row of $c$'s we can determine
by means of the formulas (31) all the remaining rows of Routh's scheme.

By the formulas (33),

$$b_0 = \Delta_1, \qquad c_0 = \frac{\Delta_2}{\Delta_1}, \qquad d_0 = \frac{\Delta_3}{\Delta_2}, \qquad \ldots$$

and therefore

$$V(a_0, b_0, c_0, \ldots) = V\!\left(a_0, \Delta_1, \frac{\Delta_2}{\Delta_1}, \ldots, \frac{\Delta_n}{\Delta_{n-1}}\right) = V(a_0, \Delta_1, \Delta_3, \ldots) + V(1, \Delta_2, \Delta_4, \ldots).$$

28 If the coefficients of $f(z)$ are given numerically, then the formulas (33), reducing
this computation, as they do, to the formation of the Routh scheme, give by far the
simplest method for computing the Hurwitz determinants.
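
The relations (33) between the Hurwitz determinants and the first column of Routh's scheme can be checked on a numerical example. The sketch below is illustrative only: the function names are hypothetical, the coefficients are assumed to be listed from the highest power down, and the determinants are computed by a naive expansion that is adequate only for small $n$.

```python
from fractions import Fraction

def hurwitz_matrix(coeffs):
    """Hurwitz matrix (27) for f(z) = a0 z^n + b0 z^(n-1) + a1 z^(n-2) + ..."""
    n = len(coeffs) - 1
    a = list(coeffs[0::2])               # a_0, a_1, ...
    b = list(coeffs[1::2])               # b_0, b_1, ...

    def entry(seq, k):
        return Fraction(seq[k]) if 0 <= k < len(seq) else Fraction(0)

    H = []
    for r in range(n):
        seq = b if r % 2 == 0 else a     # rows alternate b, a, b, a, ...
        shift = r // 2                   # each pair of rows moves one place right
        H.append([entry(seq, col - shift) for col in range(n)])
    return H

def det(M):
    """Determinant by expansion along the first row (small matrices only)."""
    if not M:
        return Fraction(1)
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def hurwitz_determinants(coeffs):
    """Successive principal minors Delta_1, ..., Delta_n of H, formula (32)."""
    H = hurwitz_matrix(coeffs)
    return [det([row[:p] for row in H[:p]]) for p in range(1, len(H) + 1)]

def routh_column_from_hurwitz(coeffs):
    """b0, c0, d0, ... = Delta_1, Delta_2/Delta_1, Delta_3/Delta_2, ... (regular case)."""
    d = hurwitz_determinants(coeffs)
    return [d[0]] + [d[p] / d[p - 1] for p in range(1, len(d))]

# f(z) = z^3 + 6 z^2 + 11 z + 6 = (z + 1)(z + 2)(z + 3)
print(hurwitz_determinants([1, 6, 11, 6]))
print(routh_column_from_hurwitz([1, 6, 11, 6]))
```

For this polynomial $\Delta_1 = 6$, $\Delta_2 = 60$, $\Delta_3 = 360$, and the quotients $6, 10, 6$ are exactly the elements $b_0, c_0, d_0$ that Routh's algorithm produces.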


Hence Routh’s theorem ean be restated as follows: 


THEOREM 4 ( Routh-Hurwitz) - The number of real roots of the poly- 
nomial f(z) =aoz" -+...inthe right half-plane 1s determined by the formula 


_ , 4, A A 
k=V (ao, Aaa. ee sy 7) 7 (34) 
or (what is the same) by 
k = V (ap; A, A, ee .) + V (1, A,, A, ve a): (34’) 


Note. This statement of the Routh-Hurwitz theorem assumes that we 
have the regular case 


$$\Delta_1 \neq 0,\ \Delta_2 \neq 0,\ \ldots,\ \Delta_n \neq 0.$$


In the following section we shall show how this formula can be used in
the singular cases where some of the Hurwitz determinants $\Delta_i$ are zero.
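
Formula (34) turns directly into a small computation. The sketch below is only an illustration under stated assumptions: the function names are invented for this example, coefficients are listed by descending powers, and the code refuses to handle the singular case.

```python
from fractions import Fraction


def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in m]
    n, result = len(a), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if a[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return result


def hurwitz_minors(c):
    """Delta_1..Delta_n for c = [c0, ..., cn] (descending powers)."""
    n = len(c) - 1

    def entry(i, j):
        idx = 2 * j - i + 1
        return c[idx] if 0 <= idx <= n else 0

    h = [[entry(i, j) for j in range(n)] for i in range(n)]
    return [det([row[:p] for row in h[:p]]) for p in range(1, n + 1)]


def sign_variations(seq):
    """Number V of sign changes in a sequence without zeros."""
    return sum(1 for x, y in zip(seq, seq[1:]) if x * y < 0)


def right_half_plane_roots(c):
    """k from formula (34); valid only when every Delta_i is nonzero."""
    d = hurwitz_minors(c)
    if any(x == 0 for x in d):
        raise ValueError("singular case: some Hurwitz determinant vanishes")
    seq = [Fraction(c[0]), d[0]] + [d[i] / d[i - 1] for i in range(1, len(d))]
    return sign_variations(seq)


print(right_half_plane_roots([1, 6, 11, 6]))  # (z + 1)(z + 2)(z + 3)
print(right_half_plane_roots([1, 1, 1, 6]))   # (z + 2)(z^2 - z + 3)
print(right_half_plane_roots([1, 4, 1, -6]))  # (z - 1)(z + 2)(z + 3)
```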


2. We now consider the special case where all the roots of f(z) are in the
left half-plane Re z < 0. By Routh’s criterion, all the $a_0, b_0, c_0, d_0, \ldots$ must
then be different from zero and of like sign. Since we are concerned here
with the regular case, we obtain from (34) for k = 0 the following criterion:


CRITERION OF ROUTH-HURWITZ: All the roots of the real polynomial
$f(z) = a_0z^n + \cdots$ $(a_0 \neq 0)$ have negative real parts if and only if the inequalities

$$a_0\Delta_1 > 0,\quad \Delta_2 > 0,\quad a_0\Delta_3 > 0,\quad \Delta_4 > 0,\ \ldots,\quad \begin{cases} a_0\Delta_n > 0 & \text{(for odd } n\text{)}, \\ \Delta_n > 0 & \text{(for even } n\text{)} \end{cases} \tag{35}$$

hold.


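Note. If $a_0 > 0$, these conditions can be written as follows:

$$\Delta_1 > 0,\quad \Delta_2 > 0,\ \ldots,\quad \Delta_n > 0. \tag{36}$$

If we use the usual notation for the coefficients of the polynomial

$$f(z) = a_0z^n + a_1z^{n-1} + a_2z^{n-2} + \cdots + a_{n-1}z + a_n,$$

then for $a_0 > 0$ the Routh-Hurwitz conditions (36) can be written in the
form of the following determinantal inequalities:

$$a_1 > 0,\quad \begin{vmatrix} a_1 & a_3 \\ a_0 & a_2 \end{vmatrix} > 0,\quad \begin{vmatrix} a_1 & a_3 & a_5 \\ a_0 & a_2 & a_4 \\ 0 & a_1 & a_3 \end{vmatrix} > 0,\ \ldots,\quad \begin{vmatrix} a_1 & a_3 & a_5 & \ldots & 0 \\ a_0 & a_2 & a_4 & \ldots & 0 \\ 0 & a_1 & a_3 & \ldots & 0 \\ 0 & a_0 & a_2 & \ldots & 0 \\ \cdot & \cdot & \cdot & \cdots & \cdot \\ \cdot & \cdot & \cdot & \cdots & a_n \end{vmatrix} > 0. \tag{36'}$$

A real polynomial $f(z) = a_0z^n + \cdots$ whose coefficients satisfy (35), i.e.,
whose roots have negative real parts, is often called a Hurwitz polynomial.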
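
As a worked illustration (not taken from the text), for n = 3 and $a_0 > 0$ the determinantal inequalities can be written out explicitly in the usual notation for the coefficients:

```latex
\[
a_1 > 0, \qquad
\begin{vmatrix} a_1 & a_3 \\ a_0 & a_2 \end{vmatrix} = a_1 a_2 - a_0 a_3 > 0, \qquad
\begin{vmatrix} a_1 & a_3 & 0 \\ a_0 & a_2 & 0 \\ 0 & a_1 & a_3 \end{vmatrix}
  = a_3\,(a_1 a_2 - a_0 a_3) > 0 .
\]
```

Thus a real cubic with $a_0 > 0$ is a Hurwitz polynomial exactly when $a_1 > 0$, $a_3 > 0$ and $a_1a_2 > a_0a_3$. For $z^3 + 6z^2 + 11z + 6 = (z+1)(z+2)(z+3)$ the three determinants are 6, 60 and 360.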


3. In conclusion, we mention a remarkable property of Routh’s scheme. 


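Let $f_0, f_1, \ldots$ and $g_0, g_1, \ldots$ be the (m+1)-th and (m+2)-th rows of the
scheme ($f_0 = \Delta_m/\Delta_{m-1}$, $g_0 = \Delta_{m+1}/\Delta_m$). Since these two rows together
with the subsequent rows form a Routh scheme of their own, the elements
of the (m+p+1)-th row (of the original scheme) can be expressed in
terms of the elements of the (m+1)-th and (m+2)-th rows $f_0, f_1, \ldots$ and
$g_0, g_1, \ldots$ by the same formulas as the (p+1)-th row can in terms of the
elements of the first two rows $a_0, a_1, \ldots$ and $b_0, b_1, \ldots$; that is, if we set

$$\tilde H = \begin{Vmatrix} g_0 & g_1 & g_2 & \ldots \\ f_0 & f_1 & f_2 & \ldots \\ 0 & g_0 & g_1 & \ldots \\ 0 & f_0 & f_1 & \ldots \\ \cdot & \cdot & \cdot & \cdots \end{Vmatrix},$$

then we have

$$\frac{H\begin{pmatrix} 1 & \ldots & m+p-1 & m+p \\ 1 & \ldots & m+p-1 & m+p+k-1 \end{pmatrix}}{H\begin{pmatrix} 1 & \ldots & m+p-1 \\ 1 & \ldots & m+p-1 \end{pmatrix}} = \frac{\tilde H\begin{pmatrix} 1 & \ldots & p-1 & p \\ 1 & \ldots & p-1 & p+k-1 \end{pmatrix}}{\tilde H\begin{pmatrix} 1 & \ldots & p-1 \\ 1 & \ldots & p-1 \end{pmatrix}}. \tag{37}$$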


The Hurwitz determinant $\Delta_{m+p}$ is equal to the product of the first m + p
numbers in the sequence $b_0, c_0, \ldots$:

$$\Delta_{m+p} = b_0c_0 \cdots f_0g_0 \cdots l_0.$$

But

$$\Delta_m = b_0c_0 \cdots f_0,\qquad \tilde\Delta_p = g_0 \cdots l_0.$$

Therefore the following important relation²⁹ holds:

$$\Delta_{m+p} = \Delta_m\tilde\Delta_p. \tag{38}$$


29 Here $\tilde\Delta_p$ is the minor of order p in the top left-hand corner of $\tilde H$.




The formula (38) holds whenever the numbers $f_0, f_1, \ldots$ and $g_0, g_1, \ldots$
are well defined, i.e., under the conditions $\Delta_{m-1} \neq 0$, $\Delta_m \neq 0$.
The formula (37) has a meaning if in addition to the conditions $\Delta_{m-1} \neq 0$,
$\Delta_m \neq 0$ we also have $\Delta_{m+p-1} \neq 0$. From this condition it follows that the
denominator of the fraction on the right-hand side of (37) is also different
from zero: $\tilde\Delta_{p-1} \neq 0$.


§ 7. Orlando’s Formula 


1. In the discussion of the cases where some of the Hurwitz determinants
are zero we shall have to use the following formula of Orlando [294], which
expresses the determinant $\Delta_{n-1}$ in terms of the highest coefficient $a_0$ and the
roots $z_1, z_2, \ldots, z_n$ of f(z):³⁰

$$\Delta_{n-1} = (-1)^{\frac{n(n-1)}{2}} a_0^{n-1} \prod_{i<k}^{1,\ldots,n} (z_i + z_k). \tag{39}$$

For n = 2 this reduces to the well-known formula for the coefficient $b_0$
in the quadratic equation $a_0z^2 + b_0z + a_1 = 0$:

$$\Delta_1 = b_0 = -a_0(z_1 + z_2).$$


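Let us assume that the formula (39) is true for polynomials of degree n,
$f(z) = a_0z^n + b_0z^{n-1} + \cdots$, and show that it is then true for polynomials of
degree n + 1

$$F(z) = (z + h)f(z) = a_0z^{n+1} + (b_0 + ha_0)z^n + (a_1 + hb_0)z^{n-1} + \cdots \qquad (h = -z_{n+1}).$$

For this purpose we form the auxiliary determinant of order n + 1

$$D = \begin{vmatrix} b_0 & b_1 & \ldots & b_{n-1} & h^n \\ a_0 & a_1 & \ldots & a_{n-1} & -h^{n-1} \\ 0 & b_0 & \ldots & b_{n-2} & h^{n-2} \\ 0 & a_0 & \ldots & a_{n-2} & -h^{n-3} \\ \cdot & \cdot & \cdots & \cdot & \cdot \\ 0 & 0 & \ldots & \cdot & (-1)^n \end{vmatrix} \qquad \left(a_k = 0 \text{ for } k > \left[\tfrac{n}{2}\right],\ b_k = 0 \text{ for } k > \left[\tfrac{n-1}{2}\right]\right).$$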


30 The coefficients of f(z) may be arbitrary complex numbers.




We multiply the first row of D by $a_0$ and add to it the second row multiplied
by $-b_0$, the third multiplied by $a_1$, the fourth by $-b_1$, etc. Then
in the first row all the elements except the last are zero, and the last element
is f(h). Hence we deduce that

$$D = (-1)^n \Delta_{n-1} f(h).$$

On the other hand, when we add to each row of D (except the last) the
next multiplied by h we obtain, apart from a factor $(-1)^n$, the Hurwitz
determinant $\tilde\Delta_n$ of order n for the polynomial F(z):

$$D = (-1)^n \begin{vmatrix} b_0 + ha_0 & b_1 + ha_1 & \ldots \\ a_0 & a_1 + hb_0 & \ldots \\ 0 & b_0 + ha_0 & \ldots \\ \cdot & \cdot & \cdots \end{vmatrix} = (-1)^n \tilde\Delta_n.$$

Thus

$$\tilde\Delta_n = \Delta_{n-1} f(h) = a_0\Delta_{n-1} \prod_{i=1}^{n} (h - z_i).$$

When we replace $\Delta_{n-1}$ by its expression (39) and set $h = -z_{n+1}$, we obtain

$$\tilde\Delta_n = (-1)^{\frac{(n+1)n}{2}} a_0^n \prod_{i<k}^{1,\ldots,n+1} (z_i + z_k).$$

Thus, by mathematical induction Orlando’s formula is established for
polynomials of every degree.

From Orlando’s formula it follows that: $\Delta_{n-1} = 0$ if and only if the sum
of two roots of f(z) is zero.³¹

Since $\Delta_n = c\Delta_{n-1}$, where c is the constant term of the polynomial f(z)
($c = (-1)^n a_0z_1z_2 \cdots z_n$), it follows from (39) that:

$$\Delta_n = (-1)^{\frac{n(n+1)}{2}} a_0^n z_1z_2 \cdots z_n \prod_{i<k}^{1,\ldots,n} (z_i + z_k). \tag{40}$$

The last formula shows that: $\Delta_n$ vanishes if and only if f(z) has a pair
of opposite roots z and −z.
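
Formulas (39) and (40) can be verified on polynomials with known roots. The sketch below is an illustration only; the helper names and the sample root sets are assumptions of this example, and exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction
from itertools import combinations


def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in m]
    n, result = len(a), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if a[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return result


def hurwitz_minors(c):
    """Delta_1..Delta_n for c = [c0, ..., cn] (descending powers)."""
    n = len(c) - 1

    def entry(i, j):
        idx = 2 * j - i + 1
        return c[idx] if 0 <= idx <= n else 0

    h = [[entry(i, j) for j in range(n)] for i in range(n)]
    return [det([row[:p] for row in h[:p]]) for p in range(1, n + 1)]


def poly_from_roots(a0, roots):
    """Coefficients (descending powers) of a0 * prod (z - r)."""
    c = [Fraction(a0)]
    for r in roots:
        c = [x - r * y for x, y in zip(c + [0], [0] + c)]
    return c


def orlando(a0, roots):
    """Right-hand sides of (39) and (40): (Delta_{n-1}, Delta_n)."""
    n = len(roots)
    pair_product = Fraction(1)
    for zi, zk in combinations(roots, 2):
        pair_product *= zi + zk
    root_product = Fraction(1)
    for z in roots:
        root_product *= z
    d_prev = (-1) ** (n * (n - 1) // 2) * Fraction(a0) ** (n - 1) * pair_product
    d_last = (-1) ** (n * (n + 1) // 2) * Fraction(a0) ** n * root_product * pair_product
    return d_prev, d_last


for a0, roots in [(1, [-1, -2, -3]), (2, [1, -1, 2]), (3, [-1, 2, -3, 5])]:
    minors = hurwitz_minors(poly_from_roots(a0, roots))
    print(minors[-2:], orlando(a0, roots))
```

The second sample has the opposite roots 1 and −1, so both determinants vanish, in agreement with the two remarks above.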


31 In particular, $\Delta_{n-1} = 0$ when f(z) has at least one pair of conjugate pure imaginary
roots or multiple zero roots.




§ 8. Singular Cases in the Routh-Hurwitz Theorem 


In discussing the singular cases where some of the Hurwitz determinants
are zero, we may assume that $\Delta_n \neq 0$ (and consequently $\Delta_{n-1} \neq 0$).

For if $\Delta_n = 0$, then, as we have seen at the end of the preceding section,
the real polynomial f(z) has a root z′ for which −z′ is also a root. If we set
$f(z) = F_1(z) + F_2(z)$, where

$$F_1(z) = a_0z^n + a_1z^{n-2} + \cdots,\qquad F_2(z) = b_0z^{n-1} + b_1z^{n-3} + \cdots,$$

then we can deduce from f(z′) = f(−z′) = 0 that $F_1(z') = F_2(z') = 0$.
Therefore z′ is a root of the greatest common divisor d(z) of the polynomials
$F_1(z)$ and $F_2(z)$. Setting f(z) = d(z)f*(z), we reduce the Routh-Hurwitz
problem for f(z) to that for the polynomial f*(z) for which the last Hurwitz
determinant is different from zero.


1. To begin with, we examine the case where 


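$$\Delta_1 = \cdots = \Delta_p = 0,\qquad \Delta_{p+1} \neq 0,\ \ldots,\ \Delta_n \neq 0. \tag{41}$$

From $\Delta_1 = 0$ it follows that $b_0 = 0$; from $\Delta_2 = \begin{vmatrix} 0 & b_1 \\ a_0 & a_1 \end{vmatrix} = -a_0b_1 = 0$ it
follows that $b_1 = 0$. But then we have automatically

$$\Delta_3 = \begin{vmatrix} 0 & b_1 & b_2 \\ a_0 & a_1 & a_2 \\ 0 & 0 & b_1 \end{vmatrix} = -a_0b_1^2 = 0.$$

From

$$\Delta_4 = \begin{vmatrix} 0 & 0 & b_2 & b_3 \\ a_0 & a_1 & a_2 & a_3 \\ 0 & 0 & 0 & b_2 \\ 0 & a_0 & a_1 & a_2 \end{vmatrix} = -a_0^2b_2^2 = 0$$

it follows that $b_2 = 0$ and then $\Delta_5 = -a_0^2b_2^3 = 0$, etc.
This argument shows that in (41) p is always an odd number p = 2h − 1.
Then $b_0 = b_1 = b_2 = \cdots = b_{h-1} = 0$, $b_h \neq 0$, and³²

$$\Delta_{p+1} = \Delta_{2h} = (-1)^{\frac{h(h+1)}{2}} a_0^hb_h^h,\qquad \Delta_{p+2} = \Delta_{2h+1} = (-1)^{\frac{h(h+1)}{2}} a_0^hb_h^{h+1} = \Delta_{p+1}b_h. \tag{42}$$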


Let us vary the coefficients $b_0, b_1, \ldots, b_{h-1}$ in such a way that for the
new, slightly altered values $b_0^*, b_1^*, \ldots, b_{h-1}^*$ all the Hurwitz determinants
$\Delta_1^*, \Delta_2^*, \ldots, \Delta_n^*$ become different from zero and $\Delta_{p+1}^*, \ldots, \Delta_n^*$ keep their
previous signs. We shall take $b_0^*, b_1^*, \ldots, b_{h-1}^*$ as ‘small’ values of different
orders of ‘smallness’; indeed, we shall assume that every $b_{j-1}^*$ is in absolute
value ‘considerably’ smaller than $b_j^*$ ($j = 1, 2, \ldots, h$; $b_h^* = b_h$). The
latter means that in computing the sign of an integral algebraic expression
in the $b_i^*$ we can neglect terms in which some $b_i^*$ have an index less than
j in comparison with terms where all the $b_i^*$ have an index at least j.
We can then easily find the ‘sign-determining’ terms of $\Delta_1^*, \Delta_2^*, \ldots, \Delta_p^*$
(p = 2h − 1):³³

$$\Delta_1^* = b_0^*,\quad \Delta_2^* = -a_0b_1^* + \cdots,\quad \Delta_3^* = -a_0b_1^{*2} + \cdots,\quad \Delta_4^* = -a_0^2b_2^{*2} + \cdots,$$

$$\Delta_5^* = -a_0^2b_2^{*3} + \cdots,\quad \Delta_6^* = a_0^3b_3^{*3} + \cdots,$$

etc.; in general,

$$\Delta_{2j}^* = (-1)^{\frac{j(j+1)}{2}} a_0^j b_j^{*\,j} + \cdots \qquad (j = 1, 2, \ldots, h-1),$$

$$\Delta_{2j+1}^* = (-1)^{\frac{j(j+1)}{2}} a_0^j b_j^{*\,j+1} + \cdots \qquad (j = 0, 1, \ldots, h-1). \tag{43}$$

We choose $b_0^*, b_1^*, \ldots, b_{h-1}^*$ as positive; then the sign of $\Delta_i^*$ is determined
by the formula

$$\operatorname{sign} \Delta_i^* = (-1)^{\frac{j(j+1)}{2}} \operatorname{sign} a_0^j \qquad \left(j = \left[\tfrac{i}{2}\right];\ i = 1, 2, \ldots, p\right). \tag{44}$$

32 From (42) it follows that for odd h, $\operatorname{sign} \Delta_{p+2} = (-1)^{\frac{h+1}{2}} \operatorname{sign} a_0$, and for even
h, $\operatorname{sign} \Delta_{p+1} = (-1)^{\frac{h}{2}}$.


In any small variation of the coefficients of the polynomial the number
k remains unchanged, because f(z) has no roots on the imaginary axis.
Therefore, starting from (44) we determine the number of roots in the
right half-plane by the formula

$$k = V\!\left(a_0, \Delta_1^*, \frac{\Delta_2^*}{\Delta_1^*}, \ldots, \frac{\Delta_{p+2}^*}{\Delta_{p+1}^*}\right) + V\!\left(\frac{\Delta_{p+2}^*}{\Delta_{p+1}^*}, \ldots, \frac{\Delta_n^*}{\Delta_{n-1}^*}\right). \tag{45}$$

An elementary calculation based on (42) and (44) shows that

$$V\!\left(a_0, \Delta_1^*, \frac{\Delta_2^*}{\Delta_1^*}, \ldots, \frac{\Delta_{p+2}^*}{\Delta_{p+1}^*}\right) = h + \frac{1 - (-1)^h\varepsilon}{2} \qquad \left(p = 2h - 1,\ \varepsilon = \operatorname{sign}\left(a_0\frac{\Delta_{p+2}}{\Delta_{p+1}}\right)\right). \tag{46}$$

Note that the value on the left-hand side of (46) does not depend on the
method of varying the coefficients and retains one and the same value for
arbitrary small variations. This follows from (45), because k does not
change its value under small variations of the coefficients.


33 Essentially the same terms have already been computed above for $\Delta_1, \Delta_2, \ldots, \Delta_5$.


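2. Suppose now that for s > 0

$$\Delta_{s+1} = \cdots = \Delta_{s+p} = 0 \tag{47}$$

and that all the remaining Hurwitz determinants are different from zero.

We denote by $\tilde a_0, \tilde a_1, \ldots$ and $\tilde b_0, \tilde b_1, \ldots$ the elements of the (s+1)-th and (s+2)-th
rows in Routh’s scheme ($\tilde a_0 = \Delta_s/\Delta_{s-1}$, $\tilde b_0 = \Delta_{s+1}/\Delta_s$). We denote the
corresponding determinants by $\tilde\Delta_1, \tilde\Delta_2, \ldots, \tilde\Delta_{n-s}$. By formula (38) (p. 195),

$$\Delta_{s+1} = \Delta_s\tilde\Delta_1,\ \ldots,\ \Delta_{s+p} = \Delta_s\tilde\Delta_p,\quad \Delta_{s+p+1} = \Delta_s\tilde\Delta_{p+1},\quad \Delta_{s+p+2} = \Delta_s\tilde\Delta_{p+2}. \tag{48}$$

Then by 1. it follows that p is odd, say p = 2h − 1.³⁴

Let us vary the coefficients of f(z) in such a way that all the Hurwitz
determinants become different from zero and that those that were different
from zero before the variation retain their sign. Since the formula (46) is
applicable to the determinants $\tilde\Delta$, we then obtain, starting from (48):

$$V\!\left(\frac{\Delta_s^*}{\Delta_{s-1}^*}, \frac{\Delta_{s+1}^*}{\Delta_s^*}, \ldots, \frac{\Delta_{s+p+2}^*}{\Delta_{s+p+1}^*}\right) = h + \frac{1 - (-1)^h\varepsilon}{2} \qquad \left(p = 2h - 1,\ \varepsilon = \operatorname{sign}\left(\frac{\Delta_s}{\Delta_{s-1}}\,\frac{\Delta_{s+p+2}}{\Delta_{s+p+1}}\right)\right), \tag{49}$$

$$k = V\!\left(a_0, \Delta_1^*, \ldots, \frac{\Delta_s^*}{\Delta_{s-1}^*}\right) + V\!\left(\frac{\Delta_s^*}{\Delta_{s-1}^*}, \ldots, \frac{\Delta_{s+p+2}^*}{\Delta_{s+p+1}^*}\right) + V\!\left(\frac{\Delta_{s+p+2}^*}{\Delta_{s+p+1}^*}, \ldots, \frac{\Delta_n^*}{\Delta_{n-1}^*}\right).$$

The value on the left-hand side of (49) again does not depend on the method
of variation.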


3. Finally, let us assume that among the Hurwitz determinants there are
ν groups of zero determinants. We shall show that for every such group
(47) the value on the left-hand side of (49) does not depend on the method
of variation and is determined by that formula.³⁵ We have proved this
statement for ν = 1. Let us assume that it is true for ν − 1 groups and
then show that it is also true for ν groups. Suppose that (47) is the second
of the ν groups; we determine $\tilde\Delta_1, \tilde\Delta_2, \ldots$ in the same way as was done under 2.;
then for this variation

$$V\!\left(\frac{\Delta_s^*}{\Delta_{s-1}^*}, \frac{\Delta_{s+1}^*}{\Delta_s^*}, \ldots, \frac{\Delta_n^*}{\Delta_{n-1}^*}\right) = V\!\left(\tilde a_0^*, \tilde\Delta_1^*, \frac{\tilde\Delta_2^*}{\tilde\Delta_1^*}, \ldots, \frac{\tilde\Delta_{n-s}^*}{\tilde\Delta_{n-s-1}^*}\right).$$

Since we have only ν − 1 groups of zero determinants on the right-hand side
of this equation, our statement holds for the right-hand side and hence for
the left-hand side of the equation. In other words, the formula (49) holds
for the second, ..., ν-th group of zero Hurwitz determinants. But then it
follows from the formula

$$k = V\!\left(a_0, \Delta_1^*, \frac{\Delta_2^*}{\Delta_1^*}, \ldots, \frac{\Delta_n^*}{\Delta_{n-1}^*}\right)$$

that the value of $V\!\left(\dfrac{\Delta_s^*}{\Delta_{s-1}^*}, \dfrac{\Delta_{s+1}^*}{\Delta_s^*}, \ldots, \dfrac{\Delta_{s+p+2}^*}{\Delta_{s+p+1}^*}\right)$ does not depend on the
method of variation for the first group of zero determinants, and therefore
that (49) holds for this group as well.

34 In accordance with footnote 32, for p = 2h − 1 and odd h,
$\operatorname{sign} \Delta_{s+p+2} = (-1)^{\frac{h+1}{2}} \operatorname{sign} \Delta_{s-1}$; and for even h,
$\operatorname{sign} \Delta_{s+p+1} = (-1)^{\frac{h}{2}} \operatorname{sign} \Delta_s$.

35 From (47) and $\Delta_s \neq 0$, $\Delta_{s+p+1} \neq 0$ it follows by (48) and (42) that $\Delta_{s-1} \neq 0$,
$\Delta_{s+p+2} \neq 0$.

Thus we have proved the following theorem:


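THEOREM 5: If some of the Hurwitz determinants are zero, but $\Delta_n \neq 0$,
then the number of roots of the real polynomial f(z) in the right half-plane
is determined by the formula

$$k = V\!\left(a_0, \Delta_1, \frac{\Delta_2}{\Delta_1}, \ldots, \frac{\Delta_n}{\Delta_{n-1}}\right)$$

in which for the calculation of the value of V for every group of p successive
zero determinants (p is always odd!)

$$(\Delta_s \neq 0)\quad \Delta_{s+1} = \cdots = \Delta_{s+p} = 0\quad (\Delta_{s+p+1} \neq 0)$$

we have to set

$$V\!\left(\frac{\Delta_s}{\Delta_{s-1}}, \frac{\Delta_{s+1}}{\Delta_s}, \ldots, \frac{\Delta_{s+p+2}}{\Delta_{s+p+1}}\right) = h + \frac{1 - (-1)^h\varepsilon}{2},$$

where³⁶

$$p = 2h - 1 \quad\text{and}\quad \varepsilon = \operatorname{sign}\left(\frac{\Delta_s}{\Delta_{s-1}}\,\frac{\Delta_{s+p+2}}{\Delta_{s+p+1}}\right).$$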

§9. The Method of Quadratic Forms. Determination of the 
Number of Distinct Real Roots of a Polynomial 


Routh obtained his algorithm by applying Sturm’s theorem to the computation
of the Cauchy index of a regular rational fraction of special type (see
formula (10) on p. 178). Of the two polynomials in this fraction, numerator
and denominator, one contains only even, the other only odd powers of
the argument z.

In this and in the following sections we shall explain the deeper and
more comprehensive method of quadratic forms, due to Hermite, in its
application to the Routh-Hurwitz problem. By means of this method we
shall obtain an expression for the index of an arbitrary rational fraction
in terms of the coefficients of the numerator and denominator. The method
of quadratic forms enables us to apply the results of Frobenius’ subtle
investigations in the theory of Hankel forms (Vol. I, Chapter X, § 10) to the
Routh-Hurwitz problem and to establish a close connection of certain
remarkable theorems of Chebyshev and Markov with the problem of stability.

36 For s = 1, $\Delta_s/\Delta_{s-1}$ is to be replaced by $\Delta_1$; and for s = 0, by $a_0$.


1. We shall acquaint the reader with the method of quadratic forms first 
in the comparatively simple problem of determining the number of distinct 
real roots of a polynomial. 

In the solution of this problem we may restrict ourselves to the case
where f(z) is a real polynomial. For suppose that f(z) = u(z) + iv(z) is
a complex polynomial (u(z) and v(z) being real polynomials). Each real
root of f(z) makes u(z) and v(z) vanish simultaneously. Therefore the
complex polynomial f(z) has the same real roots as the real polynomial d(z),
the greatest common divisor of u(z) and v(z).

Thus, let f(z) be a real polynomial with the distinct roots $\alpha_1, \alpha_2, \ldots, \alpha_q$
of the respective multiplicities $n_1, n_2, \ldots, n_q$:

$$f(z) = a_0(z - \alpha_1)^{n_1}(z - \alpha_2)^{n_2} \cdots (z - \alpha_q)^{n_q}$$

$$(a_0 \neq 0;\ \alpha_i \neq \alpha_k \text{ for } i \neq k;\ i, k = 1, 2, \ldots, q).$$

We introduce Newton’s sums

$$s_p = \sum_{j=1}^{q} n_j\alpha_j^p \qquad (p = 0, 1, 2, \ldots).$$

With these sums we form the Hankel forms

$$S_n(x, x) = \sum_{i,k=0}^{n-1} s_{i+k}x_ix_k,$$

where n is an arbitrary integer, n ≥ q.

Then the following theorem holds:

THEOREM 6: The number of all the distinct roots of f(z) is equal to the
rank, and the number of all the distinct real roots to the signature, of the
form $S_n(x, x)$.




Proof. From the definition of the form $S_n(x, x)$ we immediately obtain
the following representation:

$$S_n(x, x) = \sum_{j=1}^{q} n_j(x_0 + \alpha_jx_1 + \alpha_j^2x_2 + \cdots + \alpha_j^{n-1}x_{n-1})^2. \tag{51}$$

Here to each root $\alpha_j$ of f(z) there corresponds the square of a linear form
$Z_j = x_0 + \alpha_jx_1 + \cdots + \alpha_j^{n-1}x_{n-1}$ (j = 1, 2, ..., q). The forms $Z_1, Z_2, \ldots, Z_q$
are linearly independent, since their coefficients form the Vandermonde
matrix $\|\alpha_j^k\|$ whose rank is equal to the number of distinct $\alpha_j$, i.e., to q.
Therefore (see Vol. I, p. 297) the rank of the form $S_n(x, x)$ is q.

In the representation (51) to each real root $\alpha_j$ there corresponds a positive
square. To each pair of conjugate complex roots $\alpha_j$ and $\bar\alpha_j$ there
correspond two complex conjugate forms:

$$Z_j = P_j + iQ_j,\qquad \bar Z_j = P_j - iQ_j;$$

the corresponding terms in (51) together give one positive and one negative
square:

$$n_jZ_j^2 + n_j\bar Z_j^2 = 2n_jP_j^2 - 2n_jQ_j^2.$$

Hence it is easy to see³⁷ that the signature of $S_n(x, x)$, i.e., the difference
between the number of positive and negative squares, is equal to the number
of distinct real $\alpha_j$.

This proves the theorem.


2. Using the rule for determining the signature of a quadratic form that
we established in Chapter X (Vol. I, p. 303), we obtain from the theorem the
following corollary:

COROLLARY: The number of distinct real roots of the real polynomial
f(z) is equal to the excess of permanences of sign over variations of sign in
the sequence

$$1,\quad s_0,\quad \begin{vmatrix} s_0 & s_1 \\ s_1 & s_2 \end{vmatrix},\ \ldots,\quad \begin{vmatrix} s_0 & s_1 & \ldots & s_{n-1} \\ s_1 & s_2 & \ldots & s_n \\ \cdot & \cdot & \cdots & \cdot \\ s_{n-1} & s_n & \ldots & s_{2n-2} \end{vmatrix}, \tag{52}$$

where the $s_p$ (p = 0, 1, ...) are Newton’s sums for f(z) and n is any integer
not less than the number q of distinct roots of f(z) (in particular, n can be
chosen as the degree of f(z)).


37 The quadratic form $S_n(x, x)$ is representable as an (algebraic) sum of q squares of
the real forms $Z_j$ (for real $\alpha_j$) and $P_j$ and $Q_j$ (for complex $\alpha_j$). These forms are linearly
independent, since the rank of $S_n(x, x)$ is q.




This rule for determining the number of distinct real roots is directly 
applicable only when all the numbers in (52) are different from zero. How- 
ever, since we deal here with the computation of the signature of a Hankel 
form, by the results of Vol. I, Chapter X, § 10, the rule with proper refinements
remains valid in the general case (for further details see § 11 of that
chapter).

From our theorem it follows that: All the forms 


$$S_n(x, x) \qquad (n = q, q+1, \ldots)$$


have the same rank and the same signature. 

In applying Theorem 6 (or its corollary) to determine the number of 
distinct real roots, we may take n to be the degree of f(z). 
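
The corollary to Theorem 6 can be tried out on small examples. The following sketch is illustrative only: the function names are assumptions of this example, Newton’s sums are obtained from the coefficients by Newton’s identities (a standard device that the text does not spell out), and only the regular case, in which no determinant up to the rank vanishes, is handled.

```python
from fractions import Fraction


def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in m]
    n, result = len(a), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if a[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return result


def newton_sums(c, count):
    """s_0..s_{count-1} for the roots of c[0] z^n + ... + c[n] (with multiplicity)."""
    n = len(c) - 1
    m = [Fraction(x) / c[0] for x in c]  # monic coefficients, m[0] == 1
    s = [Fraction(n)]
    for k in range(1, count):
        acc = sum(m[i] * s[k - i] for i in range(1, min(k - 1, n) + 1))
        if k <= n:
            acc += k * m[k]
        s.append(-acc)
    return s


def distinct_roots(c):
    """(number of distinct roots, number of distinct real roots), regular case only."""
    n = len(c) - 1
    s = newton_sums(c, 2 * n - 1)
    d = [det([[s[i + k] for k in range(p)] for i in range(p)]) for p in range(1, n + 1)]
    r = max(p for p in range(1, n + 1) if d[p - 1] != 0)  # rank = last nonzero D_p
    seq = [Fraction(1)] + d[:r]
    if any(x == 0 for x in seq):
        raise ValueError("singular case: the refined rule of Vol. I, Ch. X is needed")
    variations = sum(1 for x, y in zip(seq, seq[1:]) if x * y < 0)
    return r, r - 2 * variations  # permanences minus variations


print(distinct_roots([1, 0, -1, 0]))   # z(z - 1)(z + 1)
print(distinct_roots([1, 0, 1, 0]))    # z(z^2 + 1)
print(distinct_roots([1, -1, -1, 1]))  # (z - 1)^2 (z + 1)
```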

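The number of distinct real roots of the real polynomial f(z) is equal to
the index $I_{-\infty}^{+\infty}\dfrac{f'(z)}{f(z)}$ (see p. 175). Therefore the corollary to Theorem 6 gives
the formula

$$I_{-\infty}^{+\infty}\frac{f'(z)}{f(z)} = n - 2V\!\left(1,\ s_0,\ \begin{vmatrix} s_0 & s_1 \\ s_1 & s_2 \end{vmatrix},\ \ldots,\ \begin{vmatrix} s_0 & s_1 & \ldots & s_{n-1} \\ s_1 & s_2 & \ldots & s_n \\ \cdot & \cdot & \cdots & \cdot \\ s_{n-1} & s_n & \ldots & s_{2n-2} \end{vmatrix}\right),$$

where $s_p = \sum_{j=1}^{q} n_j\alpha_j^p$ (p = 0, 1, ...) are Newton’s sums and n is the degree
of f(z).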
In § 11 we shall establish a similar formula for the index of an arbitrary 
rational fraction. The information on infinite Hankel matrices that will be 
required for this purpose will be given in the next section. 


§ 10. Infinite Hankel Matrices of Finite Rank 
1. Let 


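$$s_0, s_1, s_2, \ldots$$


be a sequence of complex numbers. This determines an infinite symmetric
matrix

$$S = \|s_{i+k}\|_0^\infty = \begin{Vmatrix} s_0 & s_1 & s_2 & \ldots \\ s_1 & s_2 & s_3 & \ldots \\ s_2 & s_3 & s_4 & \ldots \\ \cdot & \cdot & \cdot & \cdots \end{Vmatrix},$$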



which is usually called a Hankel matrix. Together with the infinite Hankel
matrices we shall consider³⁸ the finite Hankel matrices $S_n = \|s_{i+k}\|_0^{n-1}$
and their associated Hankel forms

$$S_n(x, x) = \sum_{i,k=0}^{n-1} s_{i+k}x_ix_k.$$

The successive principal minors of S will be denoted by $D_1, D_2, D_3, \ldots$:

$$D_p = |s_{i+k}|_0^{p-1} \qquad (p = 1, 2, \ldots).$$


Infinite matrices may be of finite or of infinite rank. In the latter case, 
the matrices have non-zero minors of arbitrarily large order. The following 
theorem gives a necessary and sufficient condition for a sequence of numbers 


$s_0, s_1, s_2, \ldots$ to generate an infinite Hankel matrix $S = \|s_{i+k}\|_0^\infty$ of finite
rank.

THEOREM 7: The infinite matrix $S = \|s_{i+k}\|_0^\infty$ is of finite rank r if and
only if there exist r numbers $\alpha_1, \alpha_2, \ldots, \alpha_r$ such that

$$s_q = \sum_{g=1}^{r} \alpha_gs_{q-g} \qquad (q = r, r+1, \ldots) \tag{53}$$

and r is the least number having this property.

Proof. If the matrix $S = \|s_{i+k}\|_0^\infty$ has finite rank r, then its first
r + 1 rows $R_1, R_2, \ldots, R_{r+1}$ are linearly dependent. Therefore there exists
a number h ≤ r such that $R_1, R_2, \ldots, R_h$ are linearly independent and $R_{h+1}$
is a linear combination of them:

$$R_{h+1} = \sum_{g=1}^{h} \alpha_gR_{h-g+1}.$$

We consider the rows $R_{q+1}, R_{q+2}, \ldots, R_{q+h+1}$, where q is any non-negative
integer. From the structure of S it is immediately clear that the
rows $R_{q+1}, R_{q+2}, \ldots, R_{q+h+1}$ are obtained from $R_1, R_2, \ldots, R_{h+1}$ by a
‘shortening’ process in which the elements in the first q columns are omitted.
Therefore

$$R_{q+h+1} = \sum_{g=1}^{h} \alpha_gR_{q+h-g+1} \qquad (q = 0, 1, 2, \ldots).$$

Thus, every row of S beginning with the (h+1)-th can be expressed linearly
in terms of the h preceding rows and therefore in terms of the linearly
independent first h rows. Hence it follows that the rank of S is r = h.³⁹
The linear dependence

$$R_{q+h+1} = \sum_{g=1}^{h} \alpha_gR_{q+h-g+1}$$

after replacement of h by r and written in more convenient notation
yields (53).

38 See Vol. I, Chapter X, § 10.

Conversely, if (53) holds, then every row (column) of S is a linear combination
of the first r rows (columns). Therefore all the minors of S whose
orders exceed r are zero and S is of rank at most r. But the rank cannot be
less than r, since then, as we have already shown, there would be relations
of the form (53) with a smaller value than r, and this contradicts the second
condition of the theorem. The proof of the theorem is now complete.


COROLLARY: If the infinite Hankel matrix $S = \|s_{i+k}\|_0^\infty$ is of finite
rank r, then

$$D_r = |s_{i+k}|_0^{r-1} \neq 0.$$

For it follows from the relations (53) that every row (column) of S is
a linear combination of the first r rows (columns). Therefore every minor
of S of order r can be represented in the form $\alpha D_r$, where α is a constant.
Hence it follows that $D_r \neq 0$.


Note. For finite Hankel matrices of rank r the inequality $D_r \neq 0$ need
not hold. For example, $S_2 = \begin{Vmatrix} s_0 & s_1 \\ s_1 & s_2 \end{Vmatrix}$ for $s_0 = s_1 = 0$, $s_2 \neq 0$ is of rank 1,
whereas $D_1 = s_0 = 0$.
2. We shall now explain certain remarkable connections between infinite 
Hankel matrices and rational functions. 
Let

$$R(z) = \frac{g(z)}{h(z)}$$

be a proper rational fractional function, where

$$h(z) = a_0z^m + \cdots + a_m \quad (a_0 \neq 0),\qquad g(z) = b_1z^{m-1} + b_2z^{m-2} + \cdots + b_m.$$

We write the expansion of R(z) in a power series of negative powers of z:

$$R(z) = \frac{g(z)}{h(z)} = \frac{s_0}{z} + \frac{s_1}{z^2} + \frac{s_2}{z^3} + \cdots.$$

39 The statement ‘The number of linearly independent rows in a rectangular matrix is 
equal to its rank’ is true not only for finite rows, but also for infinite rows. 




If all the poles of R(z), i.e., all the values of z for which R(z) becomes infinite,
lie in the circle |z| < a, then the series on the right-hand side of the
expansion converges for |z| > a. We multiply both sides by the denominator h(z):

$$(a_0z^m + a_1z^{m-1} + \cdots + a_m)\left(\frac{s_0}{z} + \frac{s_1}{z^2} + \frac{s_2}{z^3} + \cdots\right) = b_1z^{m-1} + b_2z^{m-2} + \cdots + b_m.$$

Equating coefficients of equal powers of z on both sides of this identity,
we obtain the following system of relations:

$$\begin{aligned} a_0s_0 &= b_1, \\ a_0s_1 + a_1s_0 &= b_2, \\ &\ \,\vdots \\ a_0s_{m-1} + a_1s_{m-2} + \cdots + a_{m-1}s_0 &= b_m, \end{aligned} \tag{54}$$

$$a_0s_q + a_1s_{q-1} + \cdots + a_ms_{q-m} = 0 \qquad (q = m, m+1, \ldots). \tag{54'}$$

Setting

$$\alpha_g = -\frac{a_g}{a_0} \qquad (g = 1, 2, \ldots, m),$$

we can write the relations (54′) in the form (53) (for r = m). Therefore,
by Theorem 7, the infinite Hankel matrix

$$S = \|s_{i+k}\|_0^\infty$$

formed from the coefficients $s_0, s_1, s_2, \ldots$ is of finite rank (≤ m).

Conversely, if the matrix $S = \|s_{i+k}\|_0^\infty$ is of finite rank r, then the relations
(53) hold, which can be written in the form (54′) (for m = r). Then,
when we define the numbers $b_1, b_2, \ldots, b_m$ by the equations (54) we have the
expansion

$$\frac{b_1z^{m-1} + \cdots + b_m}{a_0z^m + a_1z^{m-1} + \cdots + a_m} = \frac{s_0}{z} + \frac{s_1}{z^2} + \cdots.$$

The least degree of the denominator m for which this expansion holds is
the same as the least integer m for which the relations (53) hold. By Theorem 7,
this least value of m is the rank of $S = \|s_{i+k}\|_0^\infty$.


Thus we have proved the following theorem:


THEOREM 8: The matrix $S = \|s_{i+k}\|_0^\infty$ is of finite rank if and only if
the sum of the series

$$R(z) = \frac{s_0}{z} + \frac{s_1}{z^2} + \frac{s_2}{z^3} + \cdots$$

is a rational function of z. In this case the rank of S is the same as the
number of poles of R(z), counting each pole with its proper multiplicity.
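
The relations (54), (54′) and the statement of Theorem 8 can be illustrated on a simple proper fraction. In the sketch below the function names and the sample function are assumptions of this example; the coefficients $s_q$ are generated by the recurrence, and the rank of the finite sections of the Hankel matrix is computed exactly.

```python
from fractions import Fraction


def series_coefficients(a, b, count):
    """s_0..s_{count-1} of g/h = s_0/z + s_1/z^2 + ..., from (54) and (54').

    a = [a_0, ..., a_m] (denominator), b = [b_1, ..., b_m] (numerator).
    """
    m = len(a) - 1
    s = []
    for q in range(count):
        rhs = Fraction(b[q]) if q < m else Fraction(0)
        rhs -= sum(a[i] * s[q - i] for i in range(1, min(q, m) + 1))
        s.append(rhs / a[0])
    return s


def rank(matrix):
    """Rank by exact Gaussian elimination."""
    a = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(a[0])):
        piv = next((i for i in range(r, len(a)) if a[i][c] != 0), None)
        if piv is None:
            continue
        a[r], a[piv] = a[piv], a[r]
        for i in range(r + 1, len(a)):
            f = a[i][c] / a[r][c]
            a[i] = [x - f * y for x, y in zip(a[i], a[r])]
        r += 1
        if r == len(a):
            break
    return r


def hankel(s, n):
    """Finite Hankel matrix S_n = ||s_{i+k}||, i, k = 0..n-1."""
    return [[s[i + k] for k in range(n)] for i in range(n)]


# R(z) = 1 / ((z - 1)(z - 2)) = 1 / (z^2 - 3z + 2): two simple poles, m = 2
s = series_coefficients([1, -3, 2], [0, 1], 9)
ranks = [rank(hankel(s, n)) for n in range(1, 6)]
print(s[:4], ranks)
```

For this function the partial fractions 1/(z − 2) − 1/(z − 1) give $s_q = 2^q - 1$, which offers an independent check of the recurrence.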




§ 11. Determination of the Index of an Arbitrary Rational Fraction 
by the Coefficients of Numerator and Denominator 


1. Suppose a rational function R(z) is given. We write its expansion in a series
of descending powers of z:⁴⁰

$$R(z) = s_{-\nu-1}z^{\nu} + \cdots + s_{-2}z + s_{-1} + \frac{s_0}{z} + \frac{s_1}{z^2} + \cdots. \tag{55}$$

The sequence of coefficients of the negative powers of z

$$s_0, s_1, s_2, \ldots$$

determines an infinite Hankel matrix $S = \|s_{i+k}\|_0^\infty$.

We have thus established a correspondence

$$R(z) \sim S.$$

Obviously two rational functions whose difference is an integral function
correspond to one and the same matrix S. However, not every matrix
$S = \|s_{i+k}\|_0^\infty$ corresponds to some rational function. In the preceding
section we have seen that an infinite matrix S corresponds to a rational
function if and only if it is of finite rank. This rank is equal to the number
of poles of R(z) (multiplicities taken into account), i.e., to the degree of
the denominator f(z) in the reduced fraction g(z)/f(z) = R(z). By means
of the expansion (55) we have a one-to-one correspondence between proper
rational functions R(z) and Hankel matrices $S = \|s_{i+k}\|_0^\infty$ of finite rank.

We mention some properties of the correspondence: 


1. If $R_1(z) \sim S_1$, $R_2(z) \sim S_2$, then for arbitrary numbers $c_1, c_2$

$$c_1R_1(z) + c_2R_2(z) \sim c_1S_1 + c_2S_2.$$

In what follows we shall have to deal with the case where the coefficients
of the numerator and the denominator of R(z) are integral rational functions
of a parameter α; R is then a rational function of z and α. From the expansion
(54) it follows that in this case the numbers $s_0, s_1, s_2, \ldots$, i.e., the elements
of S, depend rationally on α. Differentiating (55) term by term
with respect to α, we obtain:

2. If $R(z, \alpha) \sim S(\alpha)$, then $\dfrac{\partial R}{\partial\alpha} \sim \dfrac{\partial S}{\partial\alpha}$.⁴¹


40 The series (55) converges outside every circle (with center at z = 0) containing all
the poles of R(z).

41 If $S = \|s_{i+k}\|_0^\infty$, then $\dfrac{\partial S}{\partial\alpha} = \left\|\dfrac{\partial s_{i+k}}{\partial\alpha}\right\|_0^\infty$.




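2. Let us write down the expansion of R(z) in partial fractions:

$$R(z) = Q(z) + \sum_{j=1}^{q}\left\{\frac{A_1^{(j)}}{z - \alpha_j} + \frac{A_2^{(j)}}{(z - \alpha_j)^2} + \cdots + \frac{A_{\nu_j}^{(j)}}{(z - \alpha_j)^{\nu_j}}\right\}, \tag{56}$$

where Q(z) is a polynomial; we shall show how to construct the matrix S
corresponding to R(z) from the numbers α and A.

For this purpose we consider first the simple rational function

$$\frac{1}{z - \alpha} = \sum_{p=0}^{\infty} \frac{\alpha^p}{z^{p+1}}.$$

It corresponds to the matrix

$$S_\alpha = \|\alpha^{i+k}\|_0^\infty.$$

The form $S_{\alpha n}(x, x)$ associated with this matrix is

$$S_{\alpha n}(x, x) = \sum_{i,k=0}^{n-1} \alpha^{i+k}x_ix_k = (x_0 + \alpha x_1 + \cdots + \alpha^{n-1}x_{n-1})^2.$$

If

$$R(z) = Q(z) + \sum_{j=1}^{q} \frac{A^{(j)}}{z - \alpha_j},$$

then by 1. the corresponding matrix S is determined by the formula

$$S = \sum_{j=1}^{q} A^{(j)}S_{\alpha_j} = \left\|\sum_{j=1}^{q} A^{(j)}\alpha_j^{i+k}\right\|_0^\infty$$

and the corresponding quadratic form is

$$S_n(x, x) = \sum_{j=1}^{q} A^{(j)}(x_0 + \alpha_jx_1 + \cdots + \alpha_j^{n-1}x_{n-1})^2.$$

In order to proceed to the general case (56), we first differentiate the
relation

$$\frac{1}{z - \alpha} \sim S_\alpha = \|\alpha^{i+k}\|_0^\infty$$

h − 1 times term by term. By 1. and 2., we obtain:

$$\frac{1}{(z - \alpha)^h} \sim \frac{1}{(h-1)!}\frac{\partial^{h-1}S_\alpha}{\partial\alpha^{h-1}} = \left\|\binom{i+k}{h-1}\alpha^{i+k-h+1}\right\|_0^\infty \qquad \left(\binom{i+k}{h-1} = 0 \text{ for } i + k < h - 1\right).$$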
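
The closed form for the elements of the matrix corresponding to $1/(z-\alpha)^h$ can be compared with the series obtained by formal division. The sketch below is an illustration only; the function name and the sample values of α and h are assumptions of this example.

```python
from fractions import Fraction
from math import comb


def series_of_inverse_power(alpha, h, count):
    """s_0..s_{count-1} in 1/(z - alpha)^h = s_0/z + s_1/z^2 + ..., by formal division."""
    den = [Fraction(1)]  # (z - alpha)^h in descending powers
    for _ in range(h):
        den = [x - alpha * y for x, y in zip(den + [0], [0] + den)]
    s = []
    for q in range(count):
        rhs = Fraction(1) if q == h - 1 else Fraction(0)  # the numerator is the constant 1
        rhs -= sum(den[i] * s[q - i] for i in range(1, min(q, h) + 1))
        s.append(rhs / den[0])
    return s


def closed_form(alpha, h, p):
    """Element s_p = C(p, h-1) * alpha^(p-h+1), zero for p < h - 1."""
    return comb(p, h - 1) * alpha ** (p - h + 1) if p >= h - 1 else 0


print(series_of_inverse_power(2, 3, 6))
print([closed_form(2, 3, p) for p in range(6)])
```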




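Therefore, by using rule 1. again we find in the general case, where R(z)
has the expansion (56):

$$R(z) \sim S = \sum_{j=1}^{q}\left(A_1^{(j)} + A_2^{(j)}\frac{1}{1!}\frac{\partial}{\partial\alpha_j} + \cdots + A_{\nu_j}^{(j)}\frac{1}{(\nu_j - 1)!}\frac{\partial^{\nu_j-1}}{\partial\alpha_j^{\nu_j-1}}\right)S_{\alpha_j}. \tag{57}$$

By carrying out the differentiation, we obtain:

$$S = \left\|\sum_{j=1}^{q}\left[A_1^{(j)}\alpha_j^{i+k} + A_2^{(j)}\binom{i+k}{1}\alpha_j^{i+k-1} + \cdots + A_{\nu_j}^{(j)}\binom{i+k}{\nu_j - 1}\alpha_j^{i+k-\nu_j+1}\right]\right\|_0^\infty.$$

The corresponding Hankel form $S_n(x, x) = \sum_{i,k=0}^{n-1} s_{i+k}x_ix_k$ is

$$S_n(x, x) = \sum_{j=1}^{q}\left(A_1^{(j)} + A_2^{(j)}\frac{1}{1!}\frac{\partial}{\partial\alpha_j} + \cdots + A_{\nu_j}^{(j)}\frac{1}{(\nu_j - 1)!}\frac{\partial^{\nu_j-1}}{\partial\alpha_j^{\nu_j-1}}\right)(x_0 + \alpha_jx_1 + \cdots + \alpha_j^{n-1}x_{n-1})^2.$$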


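3. Now we are in a position to enunciate and prove the fundamental
theorem:⁴²


THEOREM 9: If

$$R(z) \sim S$$

and m is the rank of S,⁴³ then the Cauchy index $I_{-\infty}^{+\infty}R(z)$ is equal to the
signature⁴⁴ of the form $S_n(x, x)$ for any n ≥ m:

$$I_{-\infty}^{+\infty}R(z) = \sigma[S_n(x, x)].$$

Proof. Suppose that the expansion (56) holds. Then, by (57),

$$S = \sum_{j=1}^{q} T_{\alpha_j},$$

where each term is of the form

$$T_\alpha = \left(A_1 + A_2\frac{1}{1!}\frac{\partial}{\partial\alpha} + \cdots + A_\nu\frac{1}{(\nu-1)!}\frac{\partial^{\nu-1}}{\partial\alpha^{\nu-1}}\right)S_\alpha,\qquad S_\alpha = \|\alpha^{i+k}\|_0^\infty \tag{58}$$

and

$$S_n(x, x) = \sum_{j=1}^{q} T_{\alpha_j}(x, x) = \sum_{\alpha_j\ \text{real}} T_{\alpha_j}(x, x) + \sum_{\alpha_j\ \text{complex}} [T_{\alpha_j}(x, x) + T_{\bar\alpha_j}(x, x)].$$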


42 This theorem was proved by Hermite in 1856 for the simplest case where R(z) has
no multiple poles [187]. In the general case it was proved by Hurwitz [204] (see also
[25], pp. 17-19). The proof in the text differs from Hurwitz’ proof.

43 As we have already mentioned, m is the degree of the denominator in the reduced
representation of the rational fraction R(z) (see Theorem 8 on p. 207).

44 We denote the signature of $S_n(x, x)$ by $\sigma[S_n(x, x)]$.



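By Theorem 8, the rank of the matrix $T_{\alpha_j}$, and hence of the form $T_{\alpha_j}(x, x)$,
is $\nu_j$ (j = 1, 2, ..., q) and the rank of $S_n(x, x)$ is $m = \sum_{j=1}^{q}\nu_j$. But if the
rank of the sum of certain real quadratic forms is equal to the sum of the
ranks of the constituent forms, then the same relation holds for the
signatures:

$$\sigma[S_n(x, x)] = \sum_{\alpha_j\ \text{real}} \sigma[T_{\alpha_j}(x, x)] + \sum_{\alpha_j\ \text{complex}} \sigma[T_{\alpha_j}(x, x) + T_{\bar\alpha_j}(x, x)]. \tag{59}$$

We consider two cases separately:

1) α is real. Under any variation of the parameters $A_1, A_2, \ldots, A_{\nu-1}$
and α in

$$\frac{A_1}{z - \alpha} + \frac{A_2}{(z - \alpha)^2} + \cdots + \frac{A_\nu}{(z - \alpha)^\nu} \qquad (A_\nu \neq 0) \tag{60}$$

the rank of the corresponding matrix $T_\alpha$ remains unchanged (= ν); therefore
the signature of $T_\alpha(x, x)$ also remains unchanged (see Vol. I, p. 309).
Therefore $\sigma[T_\alpha(x, x)]$ does not change if we set in (58) and (60): $A_1 = \cdots = A_{\nu-1} = 0$
and α = 0, i.e. if for $T_\alpha$ we take the matrix

$$\frac{A_\nu}{(\nu-1)!}\left[\frac{\partial^{\nu-1}S_\alpha}{\partial\alpha^{\nu-1}}\right]_{\alpha=0} = \begin{Vmatrix} 0 & \ldots & 0 & A_\nu & 0 & \ldots \\ 0 & \ldots & A_\nu & 0 & 0 & \ldots \\ \cdot & & \cdot & \cdot & \cdot & \\ A_\nu & \ldots & 0 & 0 & 0 & \ldots \\ 0 & \ldots & 0 & 0 & 0 & \ldots \\ \cdot & & \cdot & \cdot & \cdot & \end{Vmatrix}.$$

The corresponding quadratic form is equal to

$$\begin{aligned} &2A_\nu(x_0x_{\nu-1} + x_1x_{\nu-2} + \cdots + x_{s-1}x_s) && \text{for } \nu = 2s, \\ &A_\nu[2(x_0x_{\nu-1} + \cdots + x_{s-2}x_s) + x_{s-1}^2] && \text{for } \nu = 2s - 1 \end{aligned} \qquad (s = 1, 2, 3, \ldots).$$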



But the signature of the upper form is always zero and that of the lower
form is sign $A_\nu$.⁴⁵ Thus, if α is real, then

$$\sigma[T_\alpha(x, x)] = \begin{cases} 0 & \text{for even } \nu, \\ \operatorname{sign} A_\nu & \text{for odd } \nu. \end{cases} \tag{61}$$

2) α is complex.

$$T_\alpha(x, x) = \sum_{k=1}^{\nu}(P_k + iQ_k)^2,\qquad T_{\bar\alpha}(x, x) = \sum_{k=1}^{\nu}(P_k - iQ_k)^2,$$

where $P_k, Q_k$ (k = 1, 2, ..., ν) are real linear forms in the variables $x_0, x_1, x_2, \ldots, x_{n-1}$. Then

$$T_\alpha(x, x) + T_{\bar\alpha}(x, x) = 2\sum_{k=1}^{\nu}P_k^2 - 2\sum_{k=1}^{\nu}Q_k^2. \tag{62}$$

Since the rank of this quadratic form is 2ν, the $P_k, Q_k$ (k = 1, 2, ..., ν) are
linearly independent, so that by (62) for a complex α

$$\sigma[T_\alpha(x, x) + T_{\bar\alpha}(x, x)] = 0. \tag{63}$$

From (59), (61), and (63) it follows that

$$\sigma[S_n(x, x)] = \sum_{\substack{\alpha_j\ \text{real} \\ \nu_j\ \text{odd}}} \operatorname{sign} A_{\nu_j}^{(j)}.$$

But on p. 175 we saw that the sum on the right-hand side of this equation
is $I_{-\infty}^{+\infty}R(z)$. This completes the proof.
From this theorem we deduce:


COROLLARY 1: If $R(z) \sim S = \|s_{i+k}\|_0^\infty$ and m is the rank of S, then
all the quadratic forms $S_n(x, x) = \sum_{i,k=0}^{n-1} s_{i+k}x_ix_k$ (n = m, m+1, ...) have
one and the same signature.


45 Each of the products $x_0x_{\nu-1}, x_1x_{\nu-2}, \ldots$ can be replaced by a difference of squares

$$x_0x_{\nu-1} = \left(\frac{x_0 + x_{\nu-1}}{2}\right)^2 - \left(\frac{x_0 - x_{\nu-1}}{2}\right)^2,\ \ldots$$

All the squares so obtained are linearly independent.

In Chapter X, § 10 (Vol. I, pp. 343-44) we established a rule for computing
the signature of a Hankel form; moreover, Frobenius’ investigations
enabled us to formulate a rule that embraces all singular cases. By the
theorem above we can apply this rule to compute the Cauchy index. Thus
we obtain:
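COROLLARY 2: The index of an arbitrary rational function R(z) whose
corresponding matrix $S = \|s_{i+k}\|_0^\infty$ is of rank m, is determined by the
formula

$$I_{-\infty}^{+\infty}R(z) = m - 2V(1, D_1, D_2, \ldots, D_m), \tag{64}$$

where

$$D_f = |s_{i+k}|_0^{f-1} = \begin{vmatrix} s_0 & s_1 & \ldots & s_{f-1} \\ s_1 & s_2 & \ldots & s_f \\ \cdot & \cdot & \cdots & \cdot \\ s_{f-1} & s_f & \ldots & s_{2f-2} \end{vmatrix} \qquad (f = 1, 2, \ldots, m); \tag{65}$$

if among $D_1, D_2, \ldots, D_m$ there is a group of vanishing determinants⁴⁶

$$(D_h \neq 0)\quad D_{h+1} = \cdots = D_{h+p} = 0\quad (D_{h+p+1} \neq 0),$$

then in the computation of $V(D_h, D_{h+1}, \ldots, D_{h+p+1})$ we can take

$$\operatorname{sign} D_{h+j} = (-1)^{\frac{j(j-1)}{2}} \operatorname{sign} D_h \qquad (j = 1, 2, \ldots, p)$$

and this gives

$$V(D_h, D_{h+1}, \ldots, D_{h+p+1}) = \begin{cases} \dfrac{p+1}{2} & \text{for odd } p, \\[2mm] \dfrac{p+1-\varepsilon}{2} & \text{for even } p \text{ and } \varepsilon = (-1)^{\frac{p}{2}} \operatorname{sign}\dfrac{D_{h+p+1}}{D_h}. \end{cases} \tag{66}$$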


In order to express the index of a rational function in terms of the
coefficients of the numerator and denominator we shall require some additional
relations.

First of all, we can always represent R(z) in the form⁴⁷

$$R(z) = Q(z) + \frac{g(z)}{h(z)},$$

where Q(z), g(z), h(z) are polynomials and

$$h(z) = a_0z^m + a_1z^{m-1} + \cdots + a_m \quad (a_0 \neq 0),\qquad g(z) = b_0z^m + b_1z^{m-1} + \cdots + b_m.$$

Obviously,

$$I_{-\infty}^{+\infty}R(z) = I_{-\infty}^{+\infty}\frac{g(z)}{h(z)}.$$


46 Here we always have $D_m \neq 0$ (p. 206).


47 It is not necessary to replace R(z) by a proper fraction. For what follows it is
sufficient that the degree of g(z) does not exceed that of h(z).




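Let

$$\frac{g(z)}{h(z)} = s_{-1} + \frac{s_0}{z} + \frac{s_1}{z^2} + \cdots.$$

If we now get rid of the denominator and then equate equal powers of $z$ on the two sides of the equation, we obtain:

$$\begin{aligned} a_0 s_{-1} &= b_0, \\ a_0 s_0 + a_1 s_{-1} &= b_1, \\ &\;\;\vdots \\ a_0 s_{m-1} + a_1 s_{m-2} + \cdots + a_m s_{-1} &= b_m, \\ a_0 s_t + a_1 s_{t-1} + \cdots + a_m s_{t-m} &= 0 \qquad (t = m, m+1, \ldots). \end{aligned} \tag{67}$$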
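The relations (67) can be solved successively for $s_{-1}, s_0, s_1, \ldots$. The following sketch illustrates this with exact rational arithmetic; the function name `laurent_coefficients` and its argument layout are assumptions made for the illustration and do not come from the text.

```python
from fractions import Fraction


def laurent_coefficients(a, b, count):
    """Return [s_{-1}, s_0, ..., s_{count-2}] for g(z)/h(z) = s_{-1} + s_0/z + ...

    a = [a_0, ..., a_m] are the coefficients of h (a_0 != 0) and
    b = [b_0, ..., b_m] those of g, both in descending powers of z.
    """
    a = [Fraction(x) for x in a]
    b = [Fraction(x) for x in b]
    m = len(a) - 1
    if len(b) != m + 1 or a[0] == 0:
        raise ValueError("need len(b) == len(a) and a_0 != 0")
    s = []  # s[k] stores s_{k-1}
    for k in range(count):
        # k-th relation of (67): a_0 s_{k-1} + a_1 s_{k-2} + ... = b_k,
        # where b_k = 0 for k > m and a_i = 0 for i > m.
        rhs = b[k] if k <= m else Fraction(0)
        for i in range(1, min(k, m) + 1):
            rhs -= a[i] * s[k - i]
        s.append(rhs / a[0])
    return s
```

For $g(z)/h(z) = 1/(z^2 - 1)$, written with $m = 2$, the call `laurent_coefficients([1, 0, -1], [0, 0, 1], 5)` yields $s_{-1} = 0$, $s_0 = 0$, $s_1 = 1$, $s_2 = 0$, $s_3 = 1$.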


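Using (67), we find an expression for the following determinant of order $2p$ in which we put $a_j = 0$, $b_j = 0$ for $j > m$:

$$\begin{vmatrix} a_0 & a_1 & a_2 & \cdots & a_{2p-1} \\ b_0 & b_1 & b_2 & \cdots & b_{2p-1} \\ 0 & a_0 & a_1 & \cdots & a_{2p-2} \\ 0 & b_0 & b_1 & \cdots & b_{2p-2} \\ \vdots & \vdots & \vdots & & \vdots \end{vmatrix} = a_0^{2p} \begin{vmatrix} 1 & 0 & \cdots & 0 \\ s_{-1} & s_0 & \cdots & s_{2p-2} \\ 0 & 1 & \cdots & 0 \\ 0 & s_{-1} & \cdots & s_{2p-3} \\ \vdots & \vdots & & \vdots \end{vmatrix} = (-1)^{\frac{p(p-1)}{2}} a_0^{2p} \begin{vmatrix} s_{p-1} & s_p & \cdots & s_{2p-2} \\ s_{p-2} & s_{p-1} & \cdots & s_{2p-3} \\ \vdots & \vdots & & \vdots \\ s_0 & s_1 & \cdots & s_{p-1} \end{vmatrix} = a_0^{2p} D_p. \tag{68}$$

We introduce the abbreviation

$$\nabla_{2p} = \begin{vmatrix} a_0 & a_1 & \cdots & a_{2p-1} \\ b_0 & b_1 & \cdots & b_{2p-1} \\ 0 & a_0 & \cdots & a_{2p-2} \\ 0 & b_0 & \cdots & b_{2p-2} \\ \vdots & \vdots & & \vdots \end{vmatrix} \qquad (p = 1, 2, \ldots;\ a_j = b_j = 0 \text{ for } j > m). \tag{69}$$

Then (68) can be written as follows:

$$\nabla_{2p} = a_0^{2p} D_p \qquad (p = 1, 2, \ldots). \tag{68'}$$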


By this formula, Corollary 2 above leads to the following theorem : 


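THEOREM 10: If $\nabla_{2m} \neq 0$,⁴⁸ then

$$I_{-\infty}^{+\infty} \frac{b_0 z^m + b_1 z^{m-1} + \cdots + b_m}{a_0 z^m + a_1 z^{m-1} + \cdots + a_m} = m - 2V(1, \nabla_2, \nabla_4, \ldots, \nabla_{2m}), \tag{70}$$

where $\nabla_{2p}$ $(p = 1, 2, \ldots, m)$ is determined by (69); if there is a group of zero determinants

$$(\nabla_{2h} \neq 0) \quad \nabla_{2h+2} = \cdots = \nabla_{2h+2p} = 0 \quad (\nabla_{2h+2p+2} \neq 0),$$

then in computing $V(\nabla_{2h}, \nabla_{2h+2}, \ldots, \nabla_{2h+2p+2})$ we have to set:

$$\operatorname{sign} \nabla_{2h+2j} = (-1)^{\frac{j(j-1)}{2}} \operatorname{sign} \nabla_{2h} \qquad (j = 1, 2, \ldots, p)$$

or, what is the same,

$$V(\nabla_{2h}, \nabla_{2h+2}, \ldots, \nabla_{2h+2p+2}) = \begin{cases} \dfrac{p+1}{2} & \text{for odd } p, \\[2mm] \dfrac{p+1-\varepsilon}{2} & \text{for even } p \text{ and } \varepsilon = (-1)^{\frac{p}{2}} \operatorname{sign} \dfrac{\nabla_{2h+2p+2}}{\nabla_{2h}}. \end{cases}$$

Note. If $\nabla_{2m} = 0$, i.e., if the fraction under the index sign in (70) is reducible, then (70) must be replaced by another formula

$$I_{-\infty}^{+\infty} \frac{b_0 z^m + b_1 z^{m-1} + \cdots + b_m}{a_0 z^m + a_1 z^{m-1} + \cdots + a_m} = r - 2V(1, \nabla_2, \nabla_4, \ldots, \nabla_{2r}), \tag{70'}$$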
where r is the number of poles (including multiplicities) of the rational 
fraction under the index sign (i.e., r is the degree of the denominator in the 
reduced fraction). 

For in this case the index we are interested in is

$$r - 2V(1, D_1, D_2, \ldots, D_r),$$

since $r$ is the rank of the corresponding matrix $S = \| s_{i+k} \|_0^{\infty}$. But the equation (68') is of a formal character and also holds for reducible fractions. Therefore

$$V(1, D_1, D_2, \ldots, D_r) = V(1, \nabla_2, \nabla_4, \ldots, \nabla_{2r}),$$

and we have reached (70').

Formula (70’) enables us to express the index of every rational fraction 
in which the degree of the numerator does not exceed that of the denominator 
in terms of the coefficients of numerator and denominator. 
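As an illustration of formula (70), the following sketch computes the index directly from the coefficients of numerator and denominator. It is only a sketch under stated assumptions: the helper names `det`, `nabla`, `variations_with_zero_rule` and `index_from_coefficients` are hypothetical, exact arithmetic with `fractions.Fraction` is used, and the reducible case $\nabla_{2m} = 0$ (formula (70')) is deliberately not handled.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n, result = len(a), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return result


def nabla(a, b, p):
    """The determinant of order 2p in (69), with a_j = b_j = 0 for j > m."""
    m = len(a) - 1

    def entry(coeffs, k):
        return coeffs[k] if 0 <= k <= m else 0

    rows = []
    for shift in range(p):
        rows.append([entry(a, c - shift) for c in range(2 * p)])
        rows.append([entry(b, c - shift) for c in range(2 * p)])
    return det(rows)


def variations_with_zero_rule(values):
    """V(...) with the sign rule of Theorem 10 applied to interior zeros."""
    signs, last, j = [], 0, 0
    for v in values:
        if v != 0:
            last, j = (1 if v > 0 else -1), 0
            signs.append(last)
        else:
            j += 1
            signs.append(-last if (j * (j - 1) // 2) % 2 else last)
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def index_from_coefficients(a, b):
    """Formula (70) for g/h with h = sum a_i z^(m-i), g = sum b_i z^(m-i)."""
    m = len(a) - 1
    nablas = [nabla(a, b, p) for p in range(1, m + 1)]
    if nablas[-1] == 0:
        raise ValueError("nabla_2m = 0: the fraction is reducible, use (70')")
    return m - 2 * variations_with_zero_rule([1] + nablas)
```

For $z/(z^2 - 1)$ both poles have positive residue, and `index_from_coefficients([1, 0, -1], [0, 1, 0])` returns 2; for $1/(z^2 - 1)$ the residues have opposite signs, $\nabla_2 = 0$, $\nabla_4 = -1$, and the result is 0.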


48 The condition $\nabla_{2m} \neq 0$ means that $D_m \neq 0$, so that the fraction under the index sign in (70) is reduced.




§ 12. Another Proof of the Routh-Hurwitz Theorem 


1. In § 6 we proved the Routh-Hurwitz theorem with the help of Sturm’s theorem and the Routh algorithm. In this section we shall give an alternative proof based on Theorem 10 of § 11 and on properties of the Cauchy indices.

We mention a few properties of the Cauchy indices that will be required in what follows.

1. $I_a^b [-R(x)] = -I_a^b R(x)$.⁴⁹

2. $I_a^b R_1(x) R(x) = \operatorname{sign} R_1(x) \cdot I_a^b R(x)$ if $R_1(x) \neq 0, \infty$ within the interval $(a, b)$.

3. If $a < c < b$, then $I_a^b R(x) = I_a^c R(x) + I_c^b R(x) + \eta_c$, where $\eta_c = 0$ if $R(c)$ is finite and $\eta_c = \pm 1$ if $R(x)$ becomes infinite at $c$; here $\eta_c = +1$ corresponds to a jump from $-\infty$ to $+\infty$ at $c$ (for increasing $x$), and $\eta_c = -1$ to a jump from $+\infty$ to $-\infty$.

4. If $R(-x) = -R(x)$, then $I_{-a}^0 R(x) = I_0^a R(x)$. If $R(-x) = R(x)$, then $I_{-a}^0 R(x) = -I_0^a R(x)$.

5. $I_a^b R(x) + I_a^b \dfrac{1}{R(x)} = \dfrac{\varepsilon_b - \varepsilon_a}{2}$, where $\varepsilon_a$ is the sign of $R(x)$ within $(a, b)$ near $a$ and $\varepsilon_b$ is the sign of $R(x)$ within $(a, b)$ near $b$.

The first four properties follow immediately from the definition of the Cauchy index (see § 2). Property 5. follows from the fact that the sum of the indices $I_a^b R(x)$ and $I_a^b \frac{1}{R(x)}$ is equal to the difference $n_1 - n_2$, where $n_1$ is the number of times $R(x)$ changes from negative to positive when $x$ changes from $a$ to $b$, and $n_2$ the number of times $R(x)$ changes from positive to negative.
We consider a real polynomial⁵⁰

$$f(z) = a_0 z^n + a_1 z^{n-1} + a_2 z^{n-2} + \cdots + a_{n-1} z + a_n \qquad (a_0 > 0).$$

We can represent it in the form

$$f(z) = h(z^2) + z g(z^2),$$

where

$$h(u) = a_n + a_{n-2} u + \cdots, \qquad g(u) = a_{n-1} + a_{n-3} u + \cdots.$$


49 Here and in what follows the lower limit of the index may be $-\infty$ and the upper limit may be $+\infty$.
50 We have here reverted to the usual notation for the coefficients of a polynomial. 




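We shall use the notation

$$\rho = I_{-\infty}^{+\infty} \frac{a_1 z^{n-1} - a_3 z^{n-3} + \cdots}{a_0 z^n - a_2 z^{n-2} + \cdots}. \tag{71}$$

In § 3 we proved (see (20) on p. 180) that

$$\rho = n - 2k - s, \tag{72}$$

where $k$ is the number of roots of $f(z)$ with positive real parts and $s$ the number of roots of $f(z)$ on the imaginary axis.

We shall transform the expression (71) for $\rho$.

To begin with, we deal with the case where $n$ is even. Let $n = 2m$. Then

$$h(u) = a_0 u^m + a_2 u^{m-1} + \cdots + a_n, \qquad g(u) = a_1 u^{m-1} + a_3 u^{m-2} + \cdots + a_{n-1}.$$

Using the properties 1.-4. and setting $\eta = \pm 1$ if $\lim_{u \to 0-} \frac{g(u)}{h(u)} = \pm\infty$, respectively, and $\eta = 0$ otherwise, we have:

$$\begin{aligned} \rho &= -I_{-\infty}^{+\infty} \frac{z g(-z^2)}{h(-z^2)} = -2 I_0^{+\infty} \frac{z g(-z^2)}{h(-z^2)} - \eta = 2 I_{-\infty}^{0} \frac{g(u)}{h(u)} - \eta \\ &= I_{-\infty}^{0} \frac{g(u)}{h(u)} - I_{-\infty}^{0} \frac{u g(u)}{h(u)} - \eta = I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} - I_{-\infty}^{+\infty} \frac{u g(u)}{h(u)}. \end{aligned}$$

Similarly we have for odd $n$, $n = 2m + 1$:

$$h(u) = a_1 u^m + a_3 u^{m-1} + \cdots + a_n, \qquad g(u) = a_0 u^m + a_2 u^{m-1} + \cdots + a_{n-1}.$$

Setting⁵¹ $\zeta = \operatorname{sign} \left[ \frac{g(u)}{h(u)} \right]_{u = 0-}$ if $\lim_{u \to 0} \frac{g(u)}{h(u)}$ is finite and $\zeta = 0$ otherwise, we find:

$$\begin{aligned} \rho &= I_{-\infty}^{+\infty} \frac{h(-z^2)}{z g(-z^2)} = 2 I_0^{+\infty} \frac{h(-z^2)}{z g(-z^2)} + \zeta = -2 I_{-\infty}^{0} \frac{h(u)}{g(u)} + \zeta \\ &= I_{-\infty}^{0} \frac{h(u)}{u g(u)} - I_{-\infty}^{0} \frac{h(u)}{g(u)} + \zeta = I_{-\infty}^{+\infty} \frac{h(u)}{u g(u)} - I_{-\infty}^{+\infty} \frac{h(u)}{g(u)}. \end{aligned}$$

Thus⁵²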


51 Here we mean by $\operatorname{sign} [g(u)/h(u)]_{u=0-}$ the sign of $g(u)/h(u)$ for negative values of $u$ of sufficiently small modulus.

52 If $a_1 \neq 0$, then the two formulas (73') and (73'') may be combined into the single formula

$$\rho = I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} + I_{-\infty}^{+\infty} \frac{h(u)}{u g(u)}.$$




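$$\rho = I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} - I_{-\infty}^{+\infty} \frac{u g(u)}{h(u)} \qquad (n = 2m), \tag{73'}$$

$$\rho = I_{-\infty}^{+\infty} \frac{h(u)}{u g(u)} - I_{-\infty}^{+\infty} \frac{h(u)}{g(u)} \qquad (n = 2m + 1). \tag{73''}$$

As before, we denote by $\Delta_1, \Delta_2, \ldots, \Delta_n$ the Hurwitz determinants of $f(z)$. We assume that $\Delta_n \neq 0$.⁵³

1) $n = 2m$. By (70),⁵⁴

$$I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} = m - 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_{n-1}), \tag{74}$$

$$I_{-\infty}^{+\infty} \frac{u g(u)}{h(u)} = m - 2V(1, -\Delta_2, +\Delta_4, -\Delta_6, \ldots) = -m + 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_n). \tag{75}$$

But then, by (73'),

$$\rho = n - 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_{n-1}) - 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_n),$$

which in conjunction with $\rho = n - 2k$ gives

$$k = V(1, \Delta_1, \Delta_3, \ldots, \Delta_{n-1}) + V(1, \Delta_2, \Delta_4, \ldots, \Delta_n). \tag{76}$$

2) $n = 2m + 1$. By (70),⁵⁵

$$I_{-\infty}^{+\infty} \frac{h(u)}{u g(u)} = m + 1 - 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_n), \tag{77}$$

$$I_{-\infty}^{+\infty} \frac{h(u)}{g(u)} = m - 2V(1, -\Delta_2, +\Delta_4, \ldots) = -m + 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_{n-1}). \tag{78}$$

The equation $\rho = 2m + 1 - 2k$ together with (73''), (77), and (78) again gives (76).

This proves the Routh-Hurwitz theorem (see p. 194).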
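Formula (76) translates directly into a small computation. The sketch below is illustrative only: the names `det`, `hurwitz_determinants`, `variations` and `right_half_plane_roots` are hypothetical, and it assumes $a_0 > 0$ and that no Hurwitz determinant vanishes, so that the sign rule for zero determinants is not needed.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n, result = len(a), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return result


def hurwitz_determinants(a):
    """Return [Delta_1, ..., Delta_n] for f(z) = a[0] z^n + ... + a[n]."""
    n = len(a) - 1

    def coeff(k):
        return a[k] if 0 <= k <= n else 0

    # Rows of the Hurwitz matrix: (a1 a3 a5 ...), (a0 a2 a4 ...), (0 a1 a3 ...), ...
    H = [[coeff(2 * j - i + 1) for j in range(n)] for i in range(n)]
    return [det([row[:p] for row in H[:p]]) for p in range(1, n + 1)]


def variations(seq):
    """Number of sign changes in a sequence without zeros."""
    return sum(1 for x, y in zip(seq, seq[1:]) if (x > 0) != (y > 0))


def right_half_plane_roots(a):
    """Formula (76): k = V(1, D1, D3, ...) + V(1, D2, D4, ...)."""
    if a[0] <= 0:
        raise ValueError("the sketch assumes a_0 > 0")
    deltas = hurwitz_determinants(a)
    if any(d == 0 for d in deltas):
        raise ValueError("vanishing Hurwitz determinant: not covered by this sketch")
    return variations([1] + deltas[0::2]) + variations([1] + deltas[1::2])
```

For $f(z) = (z - 1)(z + 2)(z + 3) = z^3 + 4z^2 + z - 6$ one finds $\Delta_1 = 4$, $\Delta_2 = 10$, $\Delta_3 = -60$, so that $k = V(1, 4, -60) + V(1, 10) = 1$.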

53 In this case $s = 0$, so that $\rho = n - 2k$. Moreover, $\Delta_n \neq 0$ means that the fractions under the index signs in (73') and (73'') are reduced.

54 In computing $\nabla_2, \nabla_4, \ldots, \nabla_{2m}$ the values $a_0, a_1, \ldots, a_m$ and $b_0, b_1, \ldots, b_m$ must be replaced by $a_0, a_2, \ldots, a_{2m}$ and $0, a_1, a_3, \ldots, a_{2m-1}$ respectively in computing the first index and by $a_0, a_2, \ldots, a_{2m}$ and $a_1, a_3, \ldots, a_{2m-1}, 0$ respectively in computing the second index.

55 In computing the first index in (70) we take $a_0, a_2, \ldots, a_{2m}, 0$ and $0, a_1, a_3, \ldots, a_{2m+1}$, respectively, instead of $a_0, a_1, \ldots, a_{m+1}$ and $b_0, b_1, \ldots, b_{m+1}$; and in computing the second index we take $a_0, a_2, \ldots, a_{2m}$ and $a_1, a_3, \ldots, a_{2m+1}$, respectively, instead of $a_0, a_1, \ldots, a_m$ and $b_0, b_1, \ldots, b_m$.


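2. Note 1. If in the formula

$$k = V(1, \Delta_1, \Delta_3, \ldots) + V(1, \Delta_2, \Delta_4, \ldots)$$

some intermediate Hurwitz determinants are zero, then the formula remains valid, only in each group of successive zero determinants

$$(\Delta_l \neq 0) \quad \Delta_{l+2} = \Delta_{l+4} = \cdots = \Delta_{l+2p} = 0 \quad (\Delta_{l+2p+2} \neq 0)$$

the following signs must be attributed to these determinants (in accordance with Theorem 7)

$$\operatorname{sign} \Delta_{l+2j} = (-1)^{\frac{j(j-1)}{2}} \operatorname{sign} \Delta_l \qquad (j = 1, 2, \ldots, p),$$

which yields:

$$V(\Delta_l, \Delta_{l+2}, \ldots, \Delta_{l+2p+2}) = \begin{cases} \dfrac{p+1}{2} & \text{for odd } p, \\[2mm] \dfrac{p+1-\varepsilon}{2} & \text{for even } p \text{ and } \varepsilon = (-1)^{\frac{p}{2}} \operatorname{sign} \dfrac{\Delta_{l+2p+2}}{\Delta_l}. \end{cases} \tag{79}$$

A careful comparison of this rule for computing $k$ in the presence of vanishing Hurwitz determinants with the rule given in Theorem 5 (p. 201) shows that the two rules coincide.⁵⁶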


Note 2. If $\Delta_n = 0$, then the polynomials $u g(u)$ and $h(u)$ are not co-prime. We denote by $d(u)$ the greatest common divisor of $g(u)$ and $h(u)$ and by $u^{\gamma} d(u)$ that of $u g(u)$ and $h(u)$ ($\gamma = 0$ or 1). We denote the degree of $d(u)$ by $\delta$ and we set $h(u) = d(u) h_1(u)$ and $g(u) = d(u) g_1(u)$.

The irreducible rational fraction $g_1(u)/h_1(u)$ always corresponds to an infinite Hankel matrix $S = \| s_{i+k} \|_0^{\infty}$ of rank $r$, where $r$ is the degree of $h_1(u)$. The corresponding determinant $D_r \neq 0$ and $D_{r+1} = D_{r+2} = \cdots = 0$. By (68') $\nabla_{2r} \neq 0$, $\nabla_{2r+2} = \nabla_{2r+4} = \cdots = 0$. Moreover,

$$I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} = r - 2V(1, \nabla_2, \ldots, \nabla_{2r}).$$

When we apply all this to the fractions under the index sign in (74), (75), (77), and (78) we easily find that for every $n$ (even or odd) and $\kappa = 2\delta + \gamma$

$$\Delta_{n-\kappa-1} \neq 0, \quad \Delta_{n-\kappa} \neq 0, \quad \Delta_{n-\kappa+1} = \cdots = \Delta_n = 0$$

and that the formulas (74), (75), (77), and (78) all remain valid in this case, provided we omit all the $\Delta_i$ with $i > n - \kappa$ on the right-hand sides and replace the number $m$ (in (77), $m + 1$) by the degree of the corresponding

56 We have to take account here of the remark made in footnote 36 (p. 201).




denominator of the fraction under the index, after reduction. We then obtain by taking (73') and (73'') into account:

$$\rho = n - \kappa - 2V(1, \Delta_1, \Delta_3, \ldots) - 2V(1, \Delta_2, \Delta_4, \ldots).$$

Together with the formula $\rho = n - 2k - s$ this gives:

$$k_1 = V(1, \Delta_1, \Delta_3, \ldots) + V(1, \Delta_2, \Delta_4, \ldots), \tag{80}$$

where $k_1 = k + s/2 - \kappa/2$ is the number of all the roots of $f(z)$ in the right half-plane, excluding those that are also roots of $f(-z)$.⁵⁷


§ 13. Some Supplements to the Routh-Hurwitz Theorem. 
Stability Criterion of Liénard and Chipart 
1. Suppose given a polynomial with real coefficients

$$f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n \qquad (a_0 > 0).$$

Then the Routh-Hurwitz conditions that are necessary and sufficient for all the roots of $f(z)$ to have negative real parts can be written in the form of the inequalities

$$\Delta_1 > 0, \ \Delta_2 > 0, \ \ldots, \ \Delta_n > 0, \tag{81}$$

where

$$\Delta_i = \begin{vmatrix} a_1 & a_3 & a_5 & \cdots & \\ a_0 & a_2 & a_4 & \cdots & \\ 0 & a_1 & a_3 & \cdots & \\ 0 & a_0 & a_2 & \cdots & \\ \vdots & \vdots & \vdots & \ddots & \\ & & & & a_i \end{vmatrix} \qquad (a_k = 0 \text{ for } k > n)$$

is the Hurwitz determinant of order $i$ ($i = 1, 2, \ldots, n$).

If (81) is satisfied, then $f(z)$ can be represented in the form of a product of $a_0$ with factors of the form $z + u$, $z^2 + vz + w$ ($u > 0$, $v > 0$, $w > 0$), so that all the coefficients of $f(z)$ are positive:⁵⁸


57 This follows from the fact that $\kappa$ is the degree of the greatest common divisor of $h(u)$ and $u g(u)$; $\kappa$ is the number of ‘special’ roots of $f(z)$, i.e., those roots $z^*$ for which $-z^*$ is also a root of $f(z)$. The number of these special roots is equal to the number of determinants in the last uninterrupted sequence of vanishing Hurwitz determinants (including $\Delta_n$): $\Delta_{n-\kappa+1} = \cdots = \Delta_n = 0$.

58 $a_0 > 0$, by assumption.


$$a_1 > 0, \ a_2 > 0, \ \ldots, \ a_n > 0. \tag{82}$$

Unlike (81), the conditions (82) are necessary but by no means sufficient for all the roots of $f(z)$ to lie in the left half-plane $\operatorname{Re} z < 0$.

However, when the conditions (82) hold, then the inequalities (81) are not independent. For example: For $n = 4$ the Routh-Hurwitz conditions reduce to the single inequality $\Delta_3 > 0$; for $n = 5$, to the two: $\Delta_2 > 0$, $\Delta_4 > 0$; for $n = 6$ to the two: $\Delta_3 > 0$, $\Delta_5 > 0$.⁵⁹

This circumstance was investigated by the French mathematicians Liénard and Chipart⁶⁰ in 1914 and enabled them to set up a stability criterion different from the Routh-Hurwitz criterion.


THEOREM 11 (Stability Criterion of Liénard and Chipart): Necessary and sufficient conditions for all the roots of the real polynomial $f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n$ ($a_0 > 0$) to have negative real parts can be given in any one of the following four forms:⁶¹

1) $a_n > 0, \ a_{n-2} > 0, \ldots; \quad \Delta_1 > 0, \ \Delta_3 > 0, \ldots$;
2) $a_n > 0, \ a_{n-2} > 0, \ldots; \quad \Delta_2 > 0, \ \Delta_4 > 0, \ldots$;
3) $a_n > 0; \ a_{n-1} > 0, \ a_{n-3} > 0, \ldots; \quad \Delta_1 > 0, \ \Delta_3 > 0, \ldots$;
4) $a_n > 0; \ a_{n-1} > 0, \ a_{n-3} > 0, \ldots; \quad \Delta_2 > 0, \ \Delta_4 > 0, \ldots$.

From Theorem 11 it follows that Hurwitz’s determinant inequalities (81) are not independent for a real polynomial $f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n$ ($a_0 > 0$) in which all the coefficients (or even only part of them: $a_n, a_{n-2}, \ldots$ or $a_n, a_{n-1}, a_{n-3}, \ldots$) are positive. In fact: If the Hurwitz determinants of odd order are positive, then those of even order are also positive; and vice versa.
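Form 1) of Theorem 11 can be checked with roughly half the determinant evaluations of the full Hurwitz test. The following is a hedged sketch of such a check; the names `det`, `hurwitz_minor` and `lienard_chipart_stable` are hypothetical and exact rational arithmetic is assumed.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n, result = len(a), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return result


def hurwitz_minor(a, order):
    """Hurwitz determinant Delta_order of f(z) = a[0] z^n + ... + a[n]."""
    n = len(a) - 1

    def coeff(k):
        return a[k] if 0 <= k <= n else 0

    return det([[coeff(2 * j - i + 1) for j in range(order)] for i in range(order)])


def lienard_chipart_stable(a):
    """Form 1): a_n > 0, a_{n-2} > 0, ...; Delta_1 > 0, Delta_3 > 0, ..."""
    if a[0] <= 0:
        raise ValueError("the criterion is stated for a_0 > 0")
    n = len(a) - 1
    coefficients_ok = all(a[k] > 0 for k in range(n, 0, -2))
    # Only the determinants of odd order are evaluated.
    return coefficients_ok and all(
        hurwitz_minor(a, p) > 0 for p in range(1, n + 1, 2)
    )
```

For example, $(z+1)(z+2)(z+3) = z^3 + 6z^2 + 11z + 6$ passes with $\Delta_1 = 6$ and $\Delta_3 = 360$, while $z^3 + z^2 + z + 6 = (z + 2)(z^2 - z + 3)$ fails because $\Delta_3 = -30$.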

Liénard and Chipart obtained the condition 1) in the paper [259] by 
means of special quadratic forms. We shall give a simpler derivation of the 
condition 1) (and also of 2), 3), 4)) based on Theorem 10 of § 11 and the 
theory of Cauchy indices and we shall obtain these conditions as a special 
case of a much more general theorem which we are now about to expound. 

We again consider the polynomials h(u) and g(u) that are connected 
with f(z) by the identity 


59 This fact has been established for the first few values of n in a number of papers 
on the theory of governors, independently of the general criterion of Liénard and Chipart, 
with which the authors of these papers were obviously not acquainted. 


60 See [259]. An account of some of the basic results of Liénard and Chipart can be 
found in the fundamental survey by M. G. Krein and M. A. Naimark [25].

61 Conditions 1), 2), 3), and 4) have a decided advantage over Hurwitz’ conditions, 
because they involve only about half the number of determinantal inequalities. 




$$f(z) = h(z^2) + z g(z^2).$$

If $n$ is even, $n = 2m$, then

$$h(u) = a_0 u^m + a_2 u^{m-1} + \cdots + a_n, \qquad g(u) = a_1 u^{m-1} + a_3 u^{m-2} + \cdots + a_{n-1};$$

if $n$ is odd, $n = 2m + 1$, then

$$h(u) = a_1 u^m + a_3 u^{m-1} + \cdots + a_n, \qquad g(u) = a_0 u^m + a_2 u^{m-1} + \cdots + a_{n-1}.$$

The conditions $a_n > 0, a_{n-2} > 0, \ldots$ (or $a_{n-1} > 0, a_{n-3} > 0, \ldots$) can therefore be replaced by the more general condition: $h(u)$ (or $g(u)$) does not change sign for $u > 0$.⁶²

Under these conditions we can deduce a formula for the number of roots 
of f(z) in the right half-plane, using only Hurwitz determinants of odd order 
or of even order. 


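THEOREM 12: If for the real polynomial

$$f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n = h(z^2) + z g(z^2) \qquad (a_0 > 0)$$

$h(u)$ (or $g(u)$) does not change sign for $u > 0$ and the last Hurwitz determinant $\Delta_n \neq 0$, then the number $k$ of roots of $f(z)$ in the right half-plane is determined by the formulas

$$\begin{array}{l|l|l} & n = 2m & n = 2m + 1 \\ \hline h(u) \text{ does not change sign for } u > 0 & \begin{aligned} k &= 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_{n-1}) \\ &= 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_n) \end{aligned} & \begin{aligned} k &= 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_n) - \tfrac{1 - \varepsilon_\infty}{2} \\ &= 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_{n-1}) + \tfrac{1 - \varepsilon_\infty}{2} \end{aligned} \\ \hline g(u) \text{ does not change sign for } u > 0 & \begin{aligned} k &= 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_{n-1}) + \tfrac{\varepsilon_\infty - \varepsilon_0}{2} \\ &= 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_n) - \tfrac{\varepsilon_\infty - \varepsilon_0}{2} \end{aligned} & \begin{aligned} k &= 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_n) - \tfrac{1 - \varepsilon_0}{2} \\ &= 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_{n-1}) + \tfrac{1 - \varepsilon_0}{2} \end{aligned} \end{array} \tag{83}$$

where⁶³

$$\varepsilon_\infty = \operatorname{sign} \left[ \frac{g(u)}{h(u)} \right]_{u = +\infty}, \qquad \varepsilon_0 = \operatorname{sign} \left[ \frac{g(u)}{h(u)} \right]_{u = 0+}. \tag{84}$$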


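62 I.e., $h(u) \geq 0$ or $h(u) \leq 0$ for $u > 0$ ($g(u) \geq 0$ or $g(u) \leq 0$ for $u > 0$).

63 If $a_1 \neq 0$, then $\varepsilon_\infty = \operatorname{sign} a_1$; and, more generally, if $a_1 = a_3 = \cdots = a_{2\mu-1} = 0$, $a_{2\mu+1} \neq 0$, then $\varepsilon_\infty = \operatorname{sign} a_{2\mu+1}$. If $a_{n-1} \neq 0$, then $\varepsilon_0 = \operatorname{sign} a_{n-1}/a_n$; and, more generally, if $a_{n-1} = a_{n-3} = \cdots = a_{n-2\mu+1} = 0$ and $a_{n-2\mu-1} \neq 0$, then $\varepsilon_0 = \operatorname{sign} a_{n-2\mu-1}/a_n$.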




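Proof. Again we use the notation

$$\rho = I_{-\infty}^{+\infty} \frac{a_1 z^{n-1} - a_3 z^{n-3} + \cdots}{a_0 z^n - a_2 z^{n-2} + \cdots}.$$

Corresponding to the table (83) we consider four cases:

1) $n = 2m$; $h(u)$ does not change sign for $u > 0$. Then⁶⁴

$$I_0^{+\infty} \frac{g(u)}{h(u)} = I_0^{+\infty} \frac{u g(u)}{h(u)} = 0,$$

and so the obvious equation

$$I_{-\infty}^{0} \frac{g(u)}{h(u)} = -I_{-\infty}^{0} \frac{u g(u)}{h(u)}$$

implies that:⁶⁵

$$I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} = -I_{-\infty}^{+\infty} \frac{u g(u)}{h(u)}.$$

But then we have from (74) and (75):

$$V(1, \Delta_1, \Delta_3, \ldots) = V(1, \Delta_2, \Delta_4, \ldots),$$

and therefore the Routh-Hurwitz formula (76) gives:

$$k = 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_{n-1}) = 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_n).$$

2) $n = 2m$; $g(u)$ does not change sign for $u > 0$. In this case,

$$I_0^{+\infty} \frac{h(u)}{g(u)} = I_0^{+\infty} \frac{h(u)}{u g(u)} = 0, \qquad I_{-\infty}^{0} \frac{h(u)}{g(u)} + I_{-\infty}^{0} \frac{h(u)}{u g(u)} = 0,$$

so that with the notation (84) we have:

$$I_{-\infty}^{+\infty} \frac{h(u)}{g(u)} + I_{-\infty}^{+\infty} \frac{h(u)}{u g(u)} - \varepsilon_0 = 0. \tag{85}$$

When we replace the functions under the index sign by their reciprocals, then we obtain by 5. (see p. 216):

$$I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} + I_{-\infty}^{+\infty} \frac{u g(u)}{h(u)} = \varepsilon_\infty - \varepsilon_0.$$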


64 If $h(u_1) = 0$ ($u_1 > 0$), then $g(u_1) \neq 0$, because $\Delta_n \neq 0$. Therefore $h(u) \geq 0$ ($u > 0$) implies that $g(u)/h(u)$ does not change sign in passing through $u = u_1$.

65 From $\Delta_n = a_n \Delta_{n-1} \neq 0$ it follows that $h(0) = a_n \neq 0$.


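But this by (74) and (75) gives:

$$V(1, \Delta_1, \Delta_3, \ldots) - V(1, \Delta_2, \Delta_4, \ldots) = \frac{\varepsilon_0 - \varepsilon_\infty}{2}.$$

Hence, in conjunction with the Routh-Hurwitz formula (76), we obtain:

$$k = 2V(1, \Delta_1, \Delta_3, \ldots) + \frac{\varepsilon_\infty - \varepsilon_0}{2} = 2V(1, \Delta_2, \Delta_4, \ldots) - \frac{\varepsilon_\infty - \varepsilon_0}{2}.$$

3) $n = 2m + 1$, $g(u)$ does not change sign for $u > 0$.

In this case, as in the preceding one, (85) holds. When we substitute the expressions for the indices from (77) and (78) into (85), we obtain:

$$V(1, \Delta_1, \Delta_3, \ldots) - V(1, \Delta_2, \Delta_4, \ldots) = \frac{1 - \varepsilon_0}{2}.$$

In conjunction with the Routh-Hurwitz formula this gives:

$$k = 2V(1, \Delta_1, \Delta_3, \ldots) - \frac{1 - \varepsilon_0}{2} = 2V(1, \Delta_2, \Delta_4, \ldots) + \frac{1 - \varepsilon_0}{2}.$$

4) $n = 2m + 1$, $h(u)$ does not change sign for $u > 0$.

From the equations

$$I_0^{+\infty} \frac{g(u)}{h(u)} = I_0^{+\infty} \frac{u g(u)}{h(u)} = 0 \quad \text{and} \quad I_{-\infty}^{0} \frac{g(u)}{h(u)} + I_{-\infty}^{0} \frac{u g(u)}{h(u)} = 0$$

we deduce:

$$I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} + I_{-\infty}^{+\infty} \frac{u g(u)}{h(u)} = 0.$$

Taking the reciprocals of the functions under the index sign, we obtain:

$$I_{-\infty}^{+\infty} \frac{h(u)}{g(u)} + I_{-\infty}^{+\infty} \frac{h(u)}{u g(u)} = \varepsilon_\infty.$$

Again, when we substitute the expressions for the indices from (77) and (78), we have:

$$V(1, \Delta_1, \Delta_3, \ldots) - V(1, \Delta_2, \Delta_4, \ldots) = \frac{1 - \varepsilon_\infty}{2}.$$

From this and the Routh-Hurwitz formula it follows that:

$$k = 2V(1, \Delta_1, \Delta_3, \ldots) - \frac{1 - \varepsilon_\infty}{2} = 2V(1, \Delta_2, \Delta_4, \ldots) + \frac{1 - \varepsilon_\infty}{2}.$$

This completes the proof of Theorem 12.

From Theorem 12 we obtain Theorem 11 as a special case.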


2. CoRoLLAaRY TO THEOREM 12: If the real polynomial 
f (2) =aq2" + az" 4 ++++a, (a9> 0) 
has positive coef ficients 
a,>0, a,>0, ag>0,... ,a,>0, 


and A, +0, then the number k of its roots in the right half-plane Rez > 0 
1s determined by the formula 


k=2V (1, Ay, 4s, ee = 27 (1, A, Ay oe is 
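The corollary can be tried out numerically. The sketch below is an illustration under explicit assumptions: all coefficients are positive, none of the Hurwitz determinants used vanishes, and the names `det`, `hurwitz_determinants`, `variations` and `count_unstable_roots` are hypothetical.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n, result = len(a), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return result


def hurwitz_determinants(a):
    """Return [Delta_1, ..., Delta_n] for f(z) = a[0] z^n + ... + a[n]."""
    n = len(a) - 1

    def coeff(k):
        return a[k] if 0 <= k <= n else 0

    H = [[coeff(2 * j - i + 1) for j in range(n)] for i in range(n)]
    return [det([row[:p] for row in H[:p]]) for p in range(1, n + 1)]


def variations(seq):
    """Number of sign changes in a sequence without zeros."""
    return sum(1 for x, y in zip(seq, seq[1:]) if (x > 0) != (y > 0))


def count_unstable_roots(a):
    """k = 2 V(1, D1, D3, ...) = 2 V(1, D2, D4, ...) for positive coefficients."""
    if any(c <= 0 for c in a):
        raise ValueError("the corollary requires positive coefficients")
    deltas = hurwitz_determinants(a)
    if any(d == 0 for d in deltas):
        raise ValueError("vanishing Hurwitz determinant: not covered by this sketch")
    k_odd = 2 * variations([1] + deltas[0::2])
    k_even = 2 * variations([1] + deltas[1::2])
    assert k_odd == k_even  # the two expressions of the corollary agree
    return k_odd
```

For $f(z) = z^3 + z^2 + z + 6 = (z + 2)(z^2 - z + 3)$ the Hurwitz determinants are $1, -5, -30$, and both expressions give $k = 2$, the two roots of $z^2 - z + 3$.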


Note. If in the last formula, or in (83), some of the intermediate Hurwitz determinants are zero, then in the computation of $V(1, \Delta_1, \Delta_3, \ldots)$ and $V(1, \Delta_2, \Delta_4, \ldots)$ the rule given in Note 1 on p. 219 must be followed.

But if $\Delta_n = \Delta_{n-1} = \cdots = \Delta_{n-\kappa+1} = 0$, $\Delta_{n-\kappa} \neq 0$, then we disregard the determinants $\Delta_{n-\kappa+1}, \ldots, \Delta_n$ in (83)⁶⁶ and determine from these formulas the number $k_1$ of the ‘non-singular’ roots of $f(z)$ in the right half-plane, provided only that $h(u) \neq 0$ for $u > 0$ or $g(u) \neq 0$ for $u > 0$.⁶⁷


§ 14. Some Properties of Hurwitz Polynomials. Stieltjes’ Theorem. 
Representation of Hurwitz Polynomials by Continued Fractions 


1. Let

$$f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n \qquad (a_0 \neq 0)$$

be a real polynomial. We represent it in the form

$$f(z) = h(z^2) + z g(z^2).$$

We shall investigate what conditions have to be imposed on $h(u)$ and $g(u)$ in order that $f(z)$ be a Hurwitz polynomial.

Setting $k = s = 0$ in (20) (p. 180), we obtain a necessary and sufficient condition for $f(z)$ to be a Hurwitz polynomial, in the form

$$\rho = n,$$

where, as in the preceding sections,

$$\rho = I_{-\infty}^{+\infty} \frac{a_1 z^{n-1} - a_3 z^{n-3} + \cdots}{a_0 z^n - a_2 z^{n-2} + \cdots}.$$
66 See p. 220. 


67 In this case the polynomials $h_1(u)$ and $g_1(u)$ obtained from $h(u)$ and $g(u)$ by dividing them by their greatest common divisor $d(u)$ satisfy the conditions of Theorem 12.




Let $n = 2m$. By (73') (p. 218), this condition can be written as follows:

$$n = 2m = I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} - I_{-\infty}^{+\infty} \frac{u g(u)}{h(u)}. \tag{86}$$

Since the absolute value of the index of a rational fraction cannot exceed the degree of the denominator (in this case, $m$), the equation (86) can hold if and only if

$$I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} = m \quad \text{and} \quad I_{-\infty}^{+\infty} \frac{u g(u)}{h(u)} = -m \tag{87}$$

hold simultaneously.

For $n = 2m + 1$ the equation (73'') gives (on account of $\rho = n$):

$$n = I_{-\infty}^{+\infty} \frac{h(u)}{u g(u)} - I_{-\infty}^{+\infty} \frac{h(u)}{g(u)}.$$

When we replace the fractions under the index signs by their reciprocals (see 5. on p. 216) and observe that $h(u)$ and $g(u)$ are of the same degree $m$, we obtain:⁶⁸

$$n = 2m + 1 = I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} - I_{-\infty}^{+\infty} \frac{u g(u)}{h(u)} + \varepsilon_\infty. \tag{88}$$

Starting again from the fact that the absolute value of the index of a fraction cannot exceed the degree of the denominator we conclude that (88) holds if and only if

$$I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} = m, \quad I_{-\infty}^{+\infty} \frac{u g(u)}{h(u)} = -m \quad \text{and} \quad \varepsilon_\infty = 1 \tag{89}$$

hold simultaneously.

If $n = 2m$, the first of equations (87) indicates that $h(u)$ has $m$ distinct real roots $u_1 < u_2 < \cdots < u_m$ and that the proper fraction $g(u)/h(u)$ can be represented in the form

$$\frac{g(u)}{h(u)} = \sum_{i=1}^{m} \frac{R_i}{u - u_i}, \tag{90}$$

where

$$R_i = \frac{g(u_i)}{h'(u_i)} > 0 \qquad (i = 1, 2, \ldots, m). \tag{90'}$$

From this representation of $g(u)/h(u)$ it follows that between any two roots $u_i$, $u_{i+1}$ of $h(u)$ there is a real root $u_i'$ of $g(u)$ ($i = 1, 2, \ldots, m - 1$) and that the highest coefficients of $h(u)$ and $g(u)$ are of like sign, i.e.,
68 As in the preceding section, $\varepsilon_\infty = \operatorname{sign} \left[ \dfrac{g(u)}{h(u)} \right]_{u = +\infty}$.




$$h(u) = a_0 (u - u_1) \cdots (u - u_m), \qquad g(u) = a_1 (u - u_1') \cdots (u - u_{m-1}'),$$
$$u_1 < u_1' < u_2 < u_2' < \cdots < u_{m-1} < u_{m-1}' < u_m; \qquad a_0 a_1 > 0.$$

The second of equations (87) adds only one condition

$$u_m < 0.$$

By this condition all the roots of $h(u)$ and $g(u)$ must be negative.

If $n = 2m + 1$, then it follows from the first of equations (89) that $h(u)$ has $m$ distinct real roots $u_1 < u_2 < \cdots < u_m$, and that

$$\frac{g(u)}{h(u)} = s_{-1} + \sum_{i=1}^{m} \frac{R_i}{u - u_i} \qquad (s_{-1} \neq 0), \tag{91}$$

where

$$R_i = \frac{g(u_i)}{h'(u_i)} > 0 \qquad (i = 1, 2, \ldots, m). \tag{91'}$$

The third of equations (89) implies that

$$s_{-1} > 0, \tag{92}$$

i.e., that the highest coefficients $a_0$ and $a_1$ are of like sign. Moreover, it follows from (91), (91'), and (92) that $g(u)$ has $m$ real roots $u_1' < u_2' < \cdots < u_m'$ in the intervals $(-\infty, u_1), (u_1, u_2), \ldots, (u_{m-1}, u_m)$. In other words,

$$h(u) = a_1 (u - u_1) \cdots (u - u_m), \qquad g(u) = a_0 (u - u_1') \cdots (u - u_m'),$$
$$u_1' < u_1 < u_2' < u_2 < \cdots < u_m' < u_m; \qquad a_0 a_1 > 0.$$

The second of equations (89), as in the case $n = 2m$, only adds one further inequality

$$u_m < 0.$$

DEFINITION 3: We shall say that two polynomials $h(u)$ and $g(u)$ of degree $m$ (or the first of degree $m$ and the second of degree $m - 1$) form a positive pair⁶⁹ if the roots $u_1, u_2, \ldots, u_m$ and $u_1', u_2', \ldots, u_m'$ (or $u_1', u_2', \ldots, u_{m-1}'$) are all distinct, real, and negative and they alternate as follows:

$$u_1' < u_1 < u_2' < u_2 < \cdots < u_m' < u_m < 0$$
$$(\text{or } u_1 < u_1' < u_2 < \cdots < u_{m-1}' < u_m < 0)$$

and their highest coefficients are of like sign.⁷⁰


69 See [17], p. 333. The definition of a positive pair of polynomials given here differs
slightly from that given in the book [17]. 


70 If we omit the condition that the roots be negative, we obtain a real pair of poly- 
nomials. For the application of this concept to the Routh-Hurwitz problem, see [36]. 




When we introduce the positive numbers $v_i = -u_i$ and $v_i' = -u_i'$ and multiply $h(u)$ and $g(u)$ by $+1$ or $-1$ so that their highest coefficients are positive, then we can write the polynomials of this positive pair in the form

$$h(u) = a_1 \prod_{i=1}^{m} (u + v_i), \qquad g(u) = a_0 \prod_{i=1}^{m} (u + v_i'), \tag{93}$$

where

$$a_1 > 0, \ a_0 > 0, \qquad 0 < v_m < v_m' < v_{m-1} < v_{m-1}' < \cdots < v_1 < v_1',$$

in case both $h(u)$ and $g(u)$ are of degree $m$, and in the form

$$h(u) = a_0 \prod_{i=1}^{m} (u + v_i), \qquad g(u) = a_1 \prod_{i=1}^{m-1} (u + v_i'), \tag{93'}$$

where

$$a_0 > 0, \ a_1 > 0, \qquad 0 < v_m < v_{m-1}' < v_{m-1} < \cdots < v_1' < v_1,$$

in case $h(u)$ is of degree $m$ and $g(u)$ of degree $m - 1$.

By our earlier arguments we have proved the following two theorems:

THEOREM 13: The polynomial $f(z) = h(z^2) + z g(z^2)$ is a Hurwitz polynomial if and only if $h(u)$ and $g(u)$ form a positive pair.⁷¹

THEOREM 14: Two polynomials $h(u)$ and $g(u)$ the first of which is of degree $m$ and the second of degree $m$ or $m - 1$ form a positive pair if and only if the equations

$$I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} = m, \qquad I_{-\infty}^{+\infty} \frac{u g(u)}{h(u)} = -m \tag{94}$$

hold and, when $h(u)$ and $g(u)$ are of equal degree, the additional condition

$$\varepsilon_\infty = \operatorname{sign} \left[ \frac{g(u)}{h(u)} \right]_{u = +\infty} = 1 \tag{95}$$

holds.


2. Using properties of the Cauchy indices we can easily deduce from the 
last theorem a theorem of Stieltjes on the representation of a fraction 
g(u)/h(u) as a continued fraction of a special type, provided h(u) and
g(u) form a positive pair of polynomials.

The proof of Stieltjes’ theorem will be based on the following lemma: 


71 This theorem is a special case of the so-called Hermite-Biehler theorem (see [7], p. 21).




LEMMA. If the polynomials $h(u)$ and $g(u)$ ($h(u)$ of degree $m$) form a positive pair and

$$\frac{g(u)}{h(u)} = c + \cfrac{1}{du + \cfrac{h_1(u)}{g_1(u)}}, \tag{96}$$

where $c$, $d$ are constants and $h_1(u)$, $g_1(u)$ are polynomials of degree not exceeding $m - 1$, then

1. $c \geq 0$, $d > 0$;

2. $h_1(u)$, $g_1(u)$ are of degree $m - 1$;

3. $h_1(u)$ and $g_1(u)$ form a positive pair.

Given $h(u)$ and $g(u)$, the polynomials $h_1(u)$ and $g_1(u)$ are uniquely determined (to within a common constant factor) and so are $c$ and $d$.

Conversely, from (96) and 1., 2., 3. it follows that $h(u)$ and $g(u)$ form a positive pair, that $h(u)$ is of degree $m$, and $g(u)$ is of degree $m$ or $m - 1$ according as $c > 0$ or $c = 0$.

Proof. Let $h(u)$, $g(u)$ be a positive pair. Then it follows from (94) and (96) that:

$$m = I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} = I_{-\infty}^{+\infty} \cfrac{1}{du + \cfrac{h_1(u)}{g_1(u)}}. \tag{97}$$

This equation implies that $g_1(u)$ is of degree $m - 1$ and that $d \neq 0$.

Further, from (97) we find:

$$m = -I_{-\infty}^{+\infty} \left[ du + \frac{h_1(u)}{g_1(u)} \right] + \operatorname{sign} d = -I_{-\infty}^{+\infty} \frac{h_1(u)}{g_1(u)} + \operatorname{sign} d.$$

Hence it follows that $d > 0$ and that

$$I_{-\infty}^{+\infty} \frac{h_1(u)}{g_1(u)} = -(m - 1). \tag{98}$$

The second of equations (94) now gives:

$$-m = I_{-\infty}^{+\infty} \frac{u g(u)}{h(u)} = I_{-\infty}^{+\infty} \cfrac{1}{d + \cfrac{h_1(u)}{u g_1(u)}} = -I_{-\infty}^{+\infty} \frac{h_1(u)}{u g_1(u)}. \tag{99}$$

Hence it follows that $h_1(u)$ is of degree $m - 1$.

Condition (95) yields, by (96): $c > 0$. But if $g(u)$ is of smaller degree than $h(u)$, then it follows from (96) that $c = 0$.




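(98) and (99) imply:

$$I_{-\infty}^{+\infty} \frac{g_1(u)}{h_1(u)} = m - 1, \qquad I_{-\infty}^{+\infty} \frac{u g_1(u)}{h_1(u)} = -m + \varepsilon_\infty^{(1)}, \tag{100}$$

where

$$\varepsilon_\infty^{(1)} = \operatorname{sign} \left[ \frac{g_1(u)}{h_1(u)} \right]_{u = +\infty}.$$

Since the second of the indices (100) does not exceed $m - 1$ in absolute value, we have

$$\varepsilon_\infty^{(1)} = 1, \tag{101}$$

and then we conclude from (100) and (101), by Theorem 14, that the polynomials $h_1(u)$ and $g_1(u)$ form a positive pair.

From (96) it follows that

$$c = \lim_{u \to \infty} \frac{g(u)}{h(u)}, \qquad d = \lim_{u \to \infty} \frac{1}{u} \left[ \frac{g(u)}{h(u)} - c \right]^{-1}.$$

After $c$ and $d$ have been found, the ratio $\dfrac{h_1(u)}{g_1(u)}$ is determined by (96).

The relations (97), (98), (99), (100), and (101) applied in the reverse order, establish the second part of the lemma. Thus the proof of the lemma is complete.

Suppose given a positive pair of polynomials $h(u)$, $g(u)$, with $h(u)$ of degree $m$. Then when we divide $g(u)$ by $h(u)$ and denote the quotient by $c_0$ and the remainder by $g_1(u)$, we obtain:

$$\frac{g(u)}{h(u)} = c_0 + \frac{g_1(u)}{h(u)} = c_0 + \cfrac{1}{\cfrac{h(u)}{g_1(u)}}.$$

$\dfrac{h(u)}{g_1(u)}$ can be represented in the form $d_0 u + \dfrac{h_1(u)}{g_1(u)}$, where $h_1(u)$, like $g_1(u)$, is of degree less than $m$. Hence

$$\frac{g(u)}{h(u)} = c_0 + \cfrac{1}{d_0 u + \cfrac{h_1(u)}{g_1(u)}}. \tag{102}$$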




Thus, the representation (96) always holds for a positive pair $h(u)$ and $g(u)$. By the lemma

$$c_0 \geq 0, \qquad d_0 > 0,$$

and the polynomials $h_1(u)$ and $g_1(u)$ are of degree $m - 1$ and form a positive pair.

When we apply the same arguments to the positive pair $h_1(u)$, $g_1(u)$, we obtain

$$\frac{g_1(u)}{h_1(u)} = c_1 + \cfrac{1}{d_1 u + \cfrac{h_2(u)}{g_2(u)}}, \tag{102'}$$

where

$$c_1 > 0, \qquad d_1 > 0,$$

and the polynomials $h_2(u)$ and $g_2(u)$ are of degree $m - 2$ and form a positive pair. Continuing the process, we finally end up with a positive pair $h_m$ and $g_m$, where $h_m$ and $g_m$ are constants of like sign. We set:

$$\frac{g_m}{h_m} = c_m. \tag{102$^{(m)}$}$$

Then it follows from (102), (102'), ..., (102$^{(m)}$) that:

$$\frac{g(u)}{h(u)} = c_0 + \cfrac{1}{d_0 u + \cfrac{1}{c_1 + \cfrac{1}{d_1 u + \cfrac{1}{\ddots + \cfrac{1}{d_{m-1} u + \cfrac{1}{c_m}}}}}}.$$

Using the second part of the lemma, we show similarly that for arbitrary $c_0 \geq 0$, $c_1 > 0$, ..., $c_m > 0$, $d_0 > 0$, $d_1 > 0$, ..., $d_{m-1} > 0$ the above continued fraction determines uniquely (to within a common constant factor) a positive pair of polynomials $h(u)$ and $g(u)$, where $h(u)$ is of degree $m$ and $g(u)$ is of degree $m$ when $c_0 > 0$ and of degree $m - 1$ when $c_0 = 0$.

Thus we have proved the following theorem.⁷²


72 A proof of Stieltjes’ theorem that is not based on the theory of Cauchy indices can 
be found in the book [17], pp. 333-37. 




THEOREM 15 (Stieltjes): If $h(u)$, $g(u)$ is a positive pair of polynomials and $h(u)$ is of degree $m$, then

$$\frac{g(u)}{h(u)} = c_0 + \cfrac{1}{d_0 u + \cfrac{1}{c_1 + \cfrac{1}{d_1 u + \cfrac{1}{\ddots + \cfrac{1}{d_{m-1} u + \cfrac{1}{c_m}}}}}}, \tag{103}$$

where

$$c_0 \geq 0, \ c_1 > 0, \ \ldots, \ c_m > 0, \qquad d_0 > 0, \ \ldots, \ d_{m-1} > 0.$$

Here $c_0 = 0$ if $g(u)$ is of degree $m - 1$ and $c_0 > 0$ if $g(u)$ is of degree $m$. The constants $c_i$, $d_k$ are uniquely determined by $h(u)$, $g(u)$.

Conversely, for arbitrary $c_0 \geq 0$ and arbitrary positive $c_1, \ldots, c_m$, $d_0, \ldots, d_{m-1}$, the continued fraction (103) determines a positive pair of polynomials $h(u)$, $g(u)$, where $h(u)$ is of degree $m$.

From Theorem 13 and Stieltjes’ Theorem we deduce:

THEOREM 16: A real polynomial of degree $n$, $f(z) = h(z^2) + z g(z^2)$, is a Hurwitz polynomial if and only if the formula (103) holds with non-negative $c_0$ and positive $c_1, \ldots, c_m$, $d_0, \ldots, d_{m-1}$. Here $c_0 > 0$ when $n$ is odd and $c_0 = 0$ when $n$ is even.
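Theorem 16 suggests a stability test by repeated division, following the steps (102), (102'), and so on. The sketch below is one possible realization and not the author's procedure: the names `stieltjes_coefficients` and `is_hurwitz_by_stieltjes` are hypothetical, polynomials are given by coefficient lists in descending powers, and any degeneration of the division (a vanishing leading coefficient) is simply treated as "not a positive pair".

```python
from fractions import Fraction


def stieltjes_coefficients(h, g):
    """Expand g/h as in (103); return ([c_0, ..., c_m], [d_0, ..., d_{m-1}]).

    h and g are lists of m + 1 coefficients in descending powers of u;
    g starts with 0 when its degree is m - 1.
    """
    h = [Fraction(x) for x in h]
    g = [Fraction(x) for x in g]
    m = len(h) - 1
    if len(g) != m + 1 or h[0] == 0:
        raise ValueError("need deg h = m and len(g) == len(h)")
    cs, ds = [], []
    while m > 0:
        c = g[0] / h[0]                                   # quotient of g by h
        g1 = [gi - c * hi for gi, hi in zip(g, h)][1:]    # remainder, degree <= m - 1
        if g1[0] == 0:
            raise ValueError("g_1 has degree below m - 1")
        d = h[0] / g1[0]
        # h_1 = h - d * u * g_1, where u * g_1 has coefficients g1 + [0]
        h1 = [hi - d * gi for hi, gi in zip(h, g1 + [Fraction(0)])][1:]
        if h1[0] == 0:
            raise ValueError("h_1 has degree below m - 1")
        cs.append(c)
        ds.append(d)
        h, g, m = h1, g1, m - 1
    cs.append(g[0] / h[0])                                # c_m = g_m / h_m
    return cs, ds


def is_hurwitz_by_stieltjes(a):
    """Theorem 16 applied to f(z) = a[0] z^n + ... + a[n], n >= 1."""
    n = len(a) - 1
    if n % 2 == 0:
        h, g = a[0::2], [0] + a[1::2]   # h: a0, a2, ...; g: a1, a3, ...
    else:
        h, g = a[1::2], a[0::2]         # h: a1, a3, ...; g: a0, a2, ...
    try:
        cs, ds = stieltjes_coefficients(h, g)
    except ValueError:
        return False
    return cs[0] >= 0 and all(c > 0 for c in cs[1:]) and all(d > 0 for d in ds)
```

For $f(z) = z^3 + 6z^2 + 11z + 6$ we have $h(u) = 6u + 6$, $g(u) = u + 11$, and the expansion gives $c_0 = 1/6$, $d_0 = 3/5$, $c_1 = 5/3$, all positive, so $f$ is a Hurwitz polynomial.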


§ 15. Domain of Stability. Markov Parameters 


1. With every real polynomial of degree $n$ we can associate a point of an $n$-dimensional space whose coordinates are the quotients of the coefficients divided by the highest coefficient. In this ‘coefficient space’ all the Hurwitz polynomials form a certain $n$-dimensional domain which is determined⁷³ by the Hurwitz inequalities $\Delta_1 > 0, \Delta_2 > 0, \ldots, \Delta_n > 0$, or, for example, by the Liénard-Chipart inequalities $a_n > 0, a_{n-2} > 0, \ldots$, $\Delta_1 > 0, \Delta_3 > 0, \ldots$. We shall call it the domain of stability. If the coefficients are given as functions of $p$ parameters, then the domain of stability is constructed in the space of these parameters.

73 For $a_0 = 1$.




The study of the domain of stability is of great practical interest; for
example, it is essential in the design of new systems of governors.⁷⁴

In §17 we shall show that two remarkable theorems which were found 
by Markov and Chebyshev in connection with the expansion of continued 
fractions in power series with negative powers of the argument are closely 
connected with the investigation of the domain of stability. In formulating 
and proving these theorems it is convenient to give the polynomial not by 
its coefficients, but by special parameters, which we shall call Markov 
parameters. 

Suppose that

$$f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n \qquad (a_0 \neq 0)$$

is a real polynomial. We represent it in the form

$$f(z) = h(z^2) + z g(z^2).$$

We may assume that $h(u)$ and $g(u)$ are co-prime ($\Delta_n \neq 0$). We expand the irreducible rational fraction $\dfrac{g(u)}{h(u)}$ in a series of decreasing powers of $u$:⁷⁵

$$\frac{g(u)}{h(u)} = s_{-1} + \frac{s_0}{u} - \frac{s_1}{u^2} + \frac{s_2}{u^3} - \frac{s_3}{u^4} + \cdots. \tag{104}$$

The sequence $s_0, s_1, s_2, \ldots$ determines an infinite Hankel matrix $S = \| s_{i+k} \|_0^{\infty}$. We define a rational function $R(v)$ by

$$R(v) = -\frac{g(-v)}{h(-v)}. \tag{105}$$

Then

$$R(v) = -s_{-1} + \frac{s_0}{v} + \frac{s_1}{v^2} + \frac{s_2}{v^3} + \cdots, \tag{106}$$

so that we have the relation (see p. 208)

$$R(v) \sim S. \tag{107}$$

Hence it follows that the matrix $S$ is of rank $m = [n/2]$, since $m$, being the degree of $h(u)$, is equal to the number of poles of $R(v)$.⁷⁶

For $n = 2m$ (in this case, $s_{-1} = 0$), the matrix $S$ determines the irreducible fraction $\dfrac{g(u)}{h(u)}$ uniquely and therefore determines $f(z)$ to within a


74 A number of papers by Y. I. Naimark deal with the investigation of the domain of stability and also of the domains corresponding to various values of $k$ ($k$ is the number of roots in the right half-plane). (See the monograph [41].)

75 In what follows it is convenient to denote the coefficients of the even negative powers of $u$ by $-s_1$, $-s_3$, etc.

76 See Theorem 8 (p. 207). 




constant factor. For n= 2m + 1, in order to give f(z) by means of S it is 
necessary also to know the coefficient s_1. 

On the other hand, in order to give the infinite Hankel matrix $S$ of rank $m$ it is sufficient to know the first $2m$ numbers $s_0, s_1, \ldots, s_{2m-1}$. These numbers may be chosen arbitrarily subject to only one restriction

$$D_m = | s_{i+k} |_0^{m-1} \neq 0; \tag{108}$$

all the subsequent coefficients $s_{2m}, s_{2m+1}, \ldots$ of (104) are uniquely (and rationally) expressible in terms of the first $2m$: $s_0, s_1, \ldots, s_{2m-1}$. For in the infinite Hankel matrix $S$ of rank $m$ the elements are connected by a recurrence relation (see Theorem 7 on p. 205)

$$s_q = \sum_{g=1}^{m} \alpha_g s_{q-g} \qquad (q = m, m+1, \ldots). \tag{109}$$

If the numbers $s_0, s_1, \ldots, s_{2m-1}$ satisfy (108), then the coefficients $\alpha_1, \alpha_2, \ldots, \alpha_m$ in (109) are uniquely determined by the first $m$ relations; the subsequent relations then determine $s_{2m}, s_{2m+1}, \ldots$.
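The way in which the first $2m$ numbers determine all the others can be made concrete. The following sketch solves the first $m$ relations of (109) for $\alpha_1, \ldots, \alpha_m$ and then extends the sequence; the function names `solve_linear_system` and `extend_hankel_sequence` are hypothetical and exact rational arithmetic is assumed.

```python
from fractions import Fraction


def solve_linear_system(augmented):
    """Gauss-Jordan elimination on an n x (n + 1) augmented matrix of Fractions."""
    n = len(augmented)
    a = [row[:] for row in augmented]
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            raise ValueError("singular system: restriction (108) is violated")
        a[c], a[pivot] = a[pivot], a[c]
        a[c] = [x / a[c][c] for x in a[c]]
        for r in range(n):
            if r != c:
                a[r] = [x - a[r][c] * y for x, y in zip(a[r], a[c])]
    return [row[n] for row in a]


def extend_hankel_sequence(s, m, extra):
    """Given s_0, ..., s_{2m-1}, return (alpha, sequence extended by `extra` terms)."""
    s = [Fraction(x) for x in s[: 2 * m]]
    if len(s) != 2 * m:
        raise ValueError("need the first 2m numbers")
    # Relations (109) for q = m, ..., 2m - 1: sum_g alpha_g s_{q-g} = s_q.
    rows = [[s[q - g] for g in range(1, m + 1)] + [s[q]] for q in range(m, 2 * m)]
    alpha = solve_linear_system(rows)
    for q in range(2 * m, 2 * m + extra):
        s.append(sum(alpha[g - 1] * s[q - g] for g in range(1, m + 1)))
    return alpha, s
```

With $m = 2$ and $s_0, \ldots, s_3 = 1, 1, 2, 3$ (here $D_2 = 1 \neq 0$) one obtains $\alpha_1 = \alpha_2 = 1$, and the sequence continues with $5, 8$.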

Thus, a real polynomial $f(z)$ of degree $n = 2m$ with $\Delta_n \neq 0$ can be given uniquely⁷⁷ by $2m$ numbers $s_0, s_1, \ldots, s_{2m-1}$ satisfying (108). When $n = 2m + 1$, we have to add $s_{-1}$ to these numbers.

We shall call the $n$ values $s_0, s_1, \ldots, s_{2m-1}$ (for $n = 2m$) or $s_{-1}, s_0, \ldots, s_{2m-1}$ (for $n = 2m + 1$) the Markov parameters of the polynomial $f(z)$.
These parameters may be regarded as the coordinates in an n-dimensional 
space of a point that represents the given polynomial f(z). 
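As an illustration, the Markov parameters can be computed directly from the coefficients of $f(z)$ by expanding $g(u)/h(u)$ in powers of $1/u$. The sketch below is not taken from the text: the function name `markov_parameters` and the coefficient ordering are assumptions, and the sign convention is the one of the expansion $g(u)/h(u) = s_{-1} + s_0/u - s_1/u^2 + s_2/u^3 - \cdots$.

```python
from fractions import Fraction

def markov_parameters(coeffs):
    """coeffs = [a_0, ..., a_n] of f(z) = a_0 z^n + ... + a_n.
    Returns (s_minus1, [s_0, ..., s_{2m-1}]) with m = n // 2."""
    n = len(coeffs) - 1
    m = n // 2
    by_power = [Fraction(c) for c in reversed(coeffs)]  # by_power[j] multiplies z^j
    # f(z) = h(z^2) + z*g(z^2): even powers of z feed h, odd powers feed g.
    h = [by_power[2 * i] for i in range(m, -1, -1)]     # h[0] multiplies u^m
    g = [by_power[2 * i + 1] if 2 * i + 1 <= n else Fraction(0)
         for i in range(m, -1, -1)]                     # g[0] multiplies u^m
    if h[0] == 0:
        raise ValueError("leading coefficient of h(u) vanishes")
    # Expand g(u)/h(u) = t_0 + t_1/u + t_2/u^2 + ... by matching g = h * series.
    t = []
    for k in range(2 * m + 1):
        g_k = g[k] if k <= m else Fraction(0)
        acc = sum(h[i] * t[k - i] for i in range(1, min(k, m) + 1))
        t.append((g_k - acc) / h[0])
    # Sign convention of the expansion: t_0 = s_{-1}, t_{k+1} = (-1)^k s_k.
    s = [(-1) ** k * t[k + 1] for k in range(2 * m)]
    return t[0], s

even_case = markov_parameters([1, 3, 2])     # f(z) = (z + 1)(z + 2)
odd_case = markov_parameters([1, 3, 3, 1])   # f(z) = (z + 1)^3
```

For $f(z) = (z+1)(z+2)$ one has $h(u) = u + 2$, $g(u) = 3$, so the expansion $3/(u+2) = 3/u - 6/u^2 + \cdots$ gives $s_0 = 3$, $s_1 = 6$.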

We shall find out what conditions must be imposed on the Markov parameters
in order that the corresponding polynomial be a Hurwitz polynomial.
In this way we shall determine the domain of stability in the space of Markov
parameters.

A Hurwitz polynomial is characterized by the conditions (94) and the
additional condition (95) for $n = 2m + 1$. Introducing the function $R(v)$
(see (105)), we write (94) as follows:

$$I_{-\infty}^{+\infty} R(v) = m, \qquad I_{-\infty}^{+\infty} vR(v) = m. \tag{110}$$

The additional condition (95) for $n = 2m + 1$ gives:
$$s_{-1} > 0.$$
Apart from the matrix $S = \| s_{i+k} \|_0^{\infty}$ we introduce the infinite Hankel
matrix $S^{(1)} = \| s_{i+k+1} \|_0^{\infty}$. Then, since by (106)


77 To within a constant factor. 




$$vR(v) = -s_{-1}v + s_0 + \frac{s_1}{v} + \frac{s_2}{v^2} + \cdots,$$

the following relation holds:
$$vR(v) \sim S^{(1)}. \tag{111}$$

The matrix $S^{(1)}$, like $S$, is of finite rank $m$, since the function $vR(v)$, like
$R(v)$, has $m$ poles. Therefore the forms

$$S_m(x, x) = \sum_{i,k=0}^{m-1} s_{i+k} x_i x_k, \qquad S_m^{(1)}(x, x) = \sum_{i,k=0}^{m-1} s_{i+k+1} x_i x_k$$

are of rank $m$. But by Theorem 9 (p. 190) the signatures of these forms,
in virtue of (107) and (111), are equal to the indices (110) and hence also
to $m$. Thus, the conditions (110) mean that the quadratic forms $S_m(x, x)$ and
$S_m^{(1)}(x, x)$ are positive definite. Hence:


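THEOREM 17: A real polynomial $f(z) = h(z^2) + z g(z^2)$ of degree $n = 2m$
or $n = 2m + 1$ is a Hurwitz polynomial if and only if:^78


1. The quadratic forms

$$S_m(x, x) = \sum_{i,k=0}^{m-1} s_{i+k} x_i x_k, \qquad S_m^{(1)}(x, x) = \sum_{i,k=0}^{m-1} s_{i+k+1} x_i x_k \tag{112}$$

are positive definite; and


2. (For $n = 2m + 1$)
$$s_{-1} > 0. \tag{113}$$


Here $s_{-1}, s_0, s_1, \ldots, s_{2m-1}$ are the coefficients of the expansion

$$\frac{g(u)}{h(u)} = s_{-1} + \frac{s_0}{u} - \frac{s_1}{u^2} + \frac{s_2}{u^3} - \frac{s_3}{u^4} + \cdots.$$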


78 We do not mention the inequality $\Delta_n \neq 0$ expressly, because it follows automatically
from the conditions of the theorem. For if $f(z)$ is a Hurwitz polynomial, then it is known
that $\Delta_n \neq 0$. But if the conditions 1., 2. are given, then the fact that the form $S_m^{(1)}(x, x)$
is positive definite implies that

$$-I_{-\infty}^{+\infty} \frac{u g(u)}{h(u)} = I_{-\infty}^{+\infty} vR(v) = m,$$
and from this it follows that the fraction $u g(u)/h(u)$ is reduced, which can be expressed
by the inequality $\Delta_n \neq 0$.

In exactly the same way, it follows automatically from the conditions of the theorem
that $D_m = |s_{i+k}|_0^{m-1} \neq 0$, i.e., that the numbers $s_0, s_1, \ldots, s_{2m-1}$, and (for $n = 2m + 1$)
$s_{-1}$ are the Markov parameters of $f(z)$.


We introduce a notation for the determinants 
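$$D_p = |s_{i+k}|_0^{p-1}, \qquad D_p^{(1)} = |s_{i+k+1}|_0^{p-1} \qquad (p = 1, 2, \ldots, m). \tag{114}$$


Then condition 1. is equivalent to the system of determinantal inequalities

$$
\begin{gathered}
D_1 = s_0 > 0, \quad D_2 = \begin{vmatrix} s_0 & s_1 \\ s_1 & s_2 \end{vmatrix} > 0, \ \ldots, \quad
D_m = \begin{vmatrix} s_0 & s_1 & \cdots & s_{m-1} \\ s_1 & s_2 & \cdots & s_m \\ \vdots & \vdots & & \vdots \\ s_{m-1} & s_m & \cdots & s_{2m-2} \end{vmatrix} > 0, \\
D_1^{(1)} = s_1 > 0, \quad D_2^{(1)} = \begin{vmatrix} s_1 & s_2 \\ s_2 & s_3 \end{vmatrix} > 0, \ \ldots, \quad
D_m^{(1)} = \begin{vmatrix} s_1 & s_2 & \cdots & s_m \\ s_2 & s_3 & \cdots & s_{m+1} \\ \vdots & \vdots & & \vdots \\ s_m & s_{m+1} & \cdots & s_{2m-1} \end{vmatrix} > 0.
\end{gathered} \tag{115}
$$

If $n = 2m$, the inequalities (115) determine the domain of stability in
the space of Markov parameters. If $n = 2m + 1$, we have to add the further
inequality:
$$s_{-1} > 0. \tag{116}$$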


In the next section we shall find out what properties of $S$ follow from
the inequalities (115) and, in so doing, shall single out the special class of
infinite Hankel matrices $S$ that correspond to Hurwitz polynomials.
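The determinantal inequalities (115) and (116) translate into a short membership test for the domain of stability. What follows is a minimal sketch with exact rational arithmetic; the names `det`, `hankel_det` and `in_stability_domain` are illustrative, and the Markov parameters are assumed to be supplied already.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by exact Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def hankel_det(s, p, shift):
    """D_p (shift = 0) or D_p^(1) (shift = 1): the determinant |s_{i+k+shift}|, i, k = 0..p-1."""
    return det([[s[i + k + shift] for k in range(p)] for i in range(p)])

def in_stability_domain(s, s_minus1=None):
    """s = [s_0, ..., s_{2m-1}]; pass s_minus1 only when n = 2m + 1."""
    m = len(s) // 2
    ok = all(hankel_det(s, p, 0) > 0 and hankel_det(s, p, 1) > 0
             for p in range(1, m + 1))          # inequalities (115)
    if s_minus1 is not None:
        ok = ok and s_minus1 > 0                 # inequality (116)
    return ok
```

For example, the parameters $s_0 = 3$, $s_1 = 6$ of $f(z) = z^2 + 3z + 2$ pass the test, whereas $(2, 3, 4, 9)$ fails because $D_2 = 2 \cdot 4 - 3^2 < 0$.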


§ 16. Connection with the Problem of Moments 


1. We begin by stating the following problem: 


PROBLEM OF MOMENTS FOR THE POSITIVE AXIS $0 < v < \infty$:^79 Given a
sequence $s_0, s_1, \ldots$ of real numbers, it is required to determine positive
numbers

$$\mu_1 > 0, \ \mu_2 > 0, \ \ldots, \ \mu_m > 0, \qquad 0 < v_1 < v_2 < \cdots < v_m, \tag{117}$$

such that the following equations hold:

$$s_p = \sum_{j=1}^{m} \mu_j v_j^p \qquad (p = 0, 1, 2, \ldots). \tag{118}$$


It is not difficult to see that the system (118) of equations is equivalent 
to the following expansion in a series of negative powers of u: 


79 This problem of moments ought to be called discrete in contrast to the general exponential
problem of moments, in which the sums $\sum_{j=1}^{m} \mu_j v_j^p$ are replaced by Stieltjes integrals
$\int_0^{\infty} v^p \, d\mu(v)$ (see [55]).




$$\sum_{j=1}^{m} \frac{\mu_j}{u + v_j} = \frac{s_0}{u} - \frac{s_1}{u^2} + \frac{s_2}{u^3} - \cdots. \tag{119}$$
In this case the infinite Hankel matrix $S = \| s_{i+k} \|_0^{\infty}$ is of finite rank $m$
and by (117) in the irreducible proper fraction

$$\frac{g(u)}{h(u)} = \sum_{j=1}^{m} \frac{\mu_j}{u + v_j} \tag{120}$$

(we choose the highest coefficients of $h(u)$ and $g(u)$ to be positive) the
polynomials $h(u)$ and $g(u)$ form a positive pair (see (91) and (91')).

Therefore (see Theorem 14), our problem of moments has a solution if
and only if the sequence $s_0, s_1, s_2, \ldots$ determines by means of (119) and
(120) a Hurwitz polynomial $f(z) = h(z^2) + z g(z^2)$ of degree $2m$.

The solution of the problem of moments is unique, because the positive
numbers $v_j$ and $\mu_j$ ($j = 1, 2, \ldots, m$) are uniquely determined from the expansion
(119).

Apart from the ‘infinite’ problem of moments (118) we also consider
the ‘finite’ problem of moments given by the first $2m$ equations (118)

$$s_p = \sum_{j=1}^{m} \mu_j v_j^p \qquad (p = 0, 1, \ldots, 2m-1). \tag{121}$$

These relations already determine the following expressions for the Hankel
quadratic forms:

$$
\begin{aligned}
\sum_{i,k=0}^{m-1} s_{i+k} x_i x_k &= \sum_{j=1}^{m} \mu_j \left( x_0 + x_1 v_j + \cdots + x_{m-1} v_j^{m-1} \right)^2, \\
\sum_{i,k=0}^{m-1} s_{i+k+1} x_i x_k &= \sum_{j=1}^{m} \mu_j v_j \left( x_0 + x_1 v_j + \cdots + x_{m-1} v_j^{m-1} \right)^2.
\end{aligned} \tag{122}
$$

Since the linear forms in the variables $x_0, x_1, \ldots, x_{m-1}$

$$x_0 + x_1 v_j + \cdots + x_{m-1} v_j^{m-1} \qquad (j = 1, 2, \ldots, m)$$

are independent (their coefficients form a non-vanishing Vandermonde
determinant), the quadratic forms (122) are positive definite. But then
by Theorem 17 the numbers $s_0, s_1, \ldots, s_{2m-1}$ are the Markov parameters of
a certain Hurwitz polynomial $f(z)$. They are the first $2m$ coefficients of
the expansion (119). Together with the remaining coefficients $s_{2m}, s_{2m+1}, \ldots$
they determine the infinite solvable problem of moments (118), which has
the same solution as the finite problem (121).
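The identities (122) are easy to check on sample data. The following sketch assumes exact rational arithmetic; the helper names and the numerical values of $\mu_j$, $v_j$, $x_i$ are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

def moments(mu, v, count):
    """s_p = sum_j mu_j * v_j^p for p = 0..count-1, as in (121)."""
    return [sum(m_j * v_j ** p for m_j, v_j in zip(mu, v)) for p in range(count)]

def hankel_form(s, x, shift=0):
    """Left-hand sides of (122): sum over i, k of s_{i+k+shift} * x_i * x_k."""
    m = len(x)
    return sum(s[i + k + shift] * x[i] * x[k] for i in range(m) for k in range(m))

def sum_of_squares(mu, v, x, shift=0):
    """Right-hand sides of (122): sum_j mu_j * v_j^shift * (x_0 + x_1 v_j + ...)^2."""
    return sum(m_j * v_j ** shift * sum(x_i * v_j ** i for i, x_i in enumerate(x)) ** 2
               for m_j, v_j in zip(mu, v))

# Hypothetical sample with m = 2.
mu = [Fraction(1, 2), Fraction(3)]
v = [Fraction(1), Fraction(5, 2)]
x = [Fraction(2), Fraction(-7)]
s = moments(mu, v, 4)
pairs = [(hankel_form(s, x, shift), sum_of_squares(mu, v, x, shift)) for shift in (0, 1)]
```

Both entries of each pair in `pairs` should coincide, and both are positive for this nonzero $x$, in line with the positive definiteness of the forms.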


Thus we have proved the following theorem: 




THEOREM 18: 1) The finite problem of moments

$$s_p = \sum_{j=1}^{m} \mu_j v_j^p \tag{123}$$

($p = 0, 1, \ldots, 2m-1$; $\mu_1 > 0, \ldots, \mu_m > 0$; $0 < v_1 < v_2 < \cdots < v_m$), where
$s_p$ are given real numbers and $v_j$ and $\mu_j$ are unknown real numbers
($p = 0, 1, \ldots, 2m-1$; $j = 1, 2, \ldots, m$) has a solution if and only if the
quadratic forms

$$\sum_{i,k=0}^{m-1} s_{i+k} x_i x_k, \qquad \sum_{i,k=0}^{m-1} s_{i+k+1} x_i x_k \tag{124}$$

are positive definite, i.e., if the numbers $s_0, s_1, \ldots, s_{2m-1}$ are the Markov
parameters of some Hurwitz polynomial of degree $2m$.


2) The infinite problem of moments

$$s_p = \sum_{j=1}^{m} \mu_j v_j^p \tag{125}$$

($p = 0, 1, 2, \ldots$; $\mu_1 > 0, \ldots, \mu_m > 0$; $0 < v_1 < v_2 < \cdots < v_m$), where $s_p$ are
given real numbers and $v_j$ and $\mu_j$ are unknown real numbers ($p = 0, 1, \ldots$;
$j = 1, 2, \ldots, m$) has a solution if and only if 1. the quadratic forms (124)
are positive definite and 2. the infinite Hankel matrix $S = \| s_{i+k} \|_0^{\infty}$ is of
rank $m$, i.e., if the series

$$\frac{s_0}{u} - \frac{s_1}{u^2} + \frac{s_2}{u^3} - \cdots = \frac{g(u)}{h(u)} \tag{126}$$

determines a Hurwitz polynomial $f(z) = h(z^2) + z g(z^2)$ of degree $2m$.
3) The solution of the problem of moments, both the finite (123) and
the infinite (125) problem, is always unique.


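2. We shall use this theorem in investigating the minors of an infinite
Hankel matrix $S = \| s_{i+k} \|_0^{\infty}$ of rank $m$ corresponding to some Hurwitz
polynomial, i.e., one for which the quadratic forms (124) are positive definite.
In this case the generating numbers $s_0, s_1, s_2, \ldots$ of $S$ can be represented in
the form (123), so that for an arbitrary minor of $S$ of order $h \leq m$ we have:

$$
\begin{vmatrix} s_{i_1+k_1} & \cdots & s_{i_1+k_h} \\ \vdots & & \vdots \\ s_{i_h+k_1} & \cdots & s_{i_h+k_h} \end{vmatrix}
= \left|
\begin{pmatrix} \mu_1 v_1^{i_1} & \mu_2 v_2^{i_1} & \cdots & \mu_m v_m^{i_1} \\ \vdots & \vdots & & \vdots \\ \mu_1 v_1^{i_h} & \mu_2 v_2^{i_h} & \cdots & \mu_m v_m^{i_h} \end{pmatrix}
\begin{pmatrix} v_1^{k_1} & \cdots & v_1^{k_h} \\ v_2^{k_1} & \cdots & v_2^{k_h} \\ \vdots & & \vdots \\ v_m^{k_1} & \cdots & v_m^{k_h} \end{pmatrix}
\right|
$$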

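and therefore

$$
S \begin{pmatrix} i_1 & i_2 & \cdots & i_h \\ k_1 & k_2 & \cdots & k_h \end{pmatrix}
= \sum_{1 \leq \alpha_1 < \alpha_2 < \cdots < \alpha_h \leq m} \mu_{\alpha_1} \mu_{\alpha_2} \cdots \mu_{\alpha_h}
\begin{vmatrix} v_{\alpha_1}^{i_1} & v_{\alpha_2}^{i_1} & \cdots & v_{\alpha_h}^{i_1} \\ v_{\alpha_1}^{i_2} & v_{\alpha_2}^{i_2} & \cdots & v_{\alpha_h}^{i_2} \\ \vdots & \vdots & & \vdots \\ v_{\alpha_1}^{i_h} & v_{\alpha_2}^{i_h} & \cdots & v_{\alpha_h}^{i_h} \end{vmatrix}
\cdot
\begin{vmatrix} v_{\alpha_1}^{k_1} & v_{\alpha_1}^{k_2} & \cdots & v_{\alpha_1}^{k_h} \\ v_{\alpha_2}^{k_1} & v_{\alpha_2}^{k_2} & \cdots & v_{\alpha_2}^{k_h} \\ \vdots & \vdots & & \vdots \\ v_{\alpha_h}^{k_1} & v_{\alpha_h}^{k_2} & \cdots & v_{\alpha_h}^{k_h} \end{vmatrix}. \tag{127}
$$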

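But from the inequalities

$$0 < v_1 < v_2 < \cdots < v_m, \qquad i_1 < i_2 < \cdots < i_h, \qquad k_1 < k_2 < \cdots < k_h$$

it follows that the generalized Vandermonde determinants^80

$$
\begin{vmatrix} v_{\alpha_1}^{i_1} & v_{\alpha_2}^{i_1} & \cdots & v_{\alpha_h}^{i_1} \\ \vdots & \vdots & & \vdots \\ v_{\alpha_1}^{i_h} & v_{\alpha_2}^{i_h} & \cdots & v_{\alpha_h}^{i_h} \end{vmatrix} > 0, \qquad
\begin{vmatrix} v_{\alpha_1}^{k_1} & v_{\alpha_1}^{k_2} & \cdots & v_{\alpha_1}^{k_h} \\ \vdots & \vdots & & \vdots \\ v_{\alpha_h}^{k_1} & v_{\alpha_h}^{k_2} & \cdots & v_{\alpha_h}^{k_h} \end{vmatrix} > 0
$$

are positive.
Since the numbers $\mu_j$ are positive ($j = 1, 2, \ldots, m$), it therefore follows
from (127) that

$$
S \begin{pmatrix} i_1 & i_2 & \cdots & i_h \\ k_1 & k_2 & \cdots & k_h \end{pmatrix} > 0
\qquad \left( 0 \leq \begin{matrix} i_1 < i_2 < \cdots < i_h \\ k_1 < k_2 < \cdots < k_h \end{matrix}; \ h = 1, 2, \ldots, m \right). \tag{128}
$$

Conversely, if in an infinite Hankel matrix $S = \| s_{i+k} \|_0^{\infty}$ of rank $m$ all
the minors of every order $h \leq m$ are positive, then the quadratic forms (124)
are positive definite.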
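The inequalities (128) can be checked by brute force on a finite section of $S$. The sketch below is an illustration only: the sample values of $\mu_j$, $v_j$, the section size and the helper names are assumptions, not data from the text. It enumerates all minors of orders $1, \ldots, m$ and of order $m + 1$.

```python
from fractions import Fraction
from itertools import combinations

def det(matrix):
    """Determinant by exact Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def minors_of_order(s, size, order):
    """All minors of the given order taken from the size x size section ||s_{i+k}||."""
    for rows in combinations(range(size), order):
        for cols in combinations(range(size), order):
            yield det([[s[i + k] for k in cols] for i in rows])

# Hypothetical solvable moment problem with m = 2: mu = (1, 1), v = (1, 2).
mu, v, m = [1, 1], [1, 2], 2
size = 4
s = [sum(a * b ** p for a, b in zip(mu, v)) for p in range(2 * size - 1)]
positive = all(d > 0 for h in range(1, m + 1) for d in minors_of_order(s, size, h))
vanishing = all(d == 0 for d in minors_of_order(s, size, m + 1))
```

Here `positive` reflects (128) for the chosen section, and `vanishing` reflects the fact that $S$ has rank $m$.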


DEFINITION 4: An infinite matrix $A = \| a_{ik} \|_0^{\infty}$ will be called totally
positive of rank $m$ if and only if all the minors of $A$ of order $h \leq m$ are
positive and all the minors of order $h > m$ are zero.

The property of $S$ that we have found can now be expressed in the following
theorem:^81

THEOREM 19: An infinite Hankel matrix $S = \| s_{i+k} \|_0^{\infty}$ is totally positive
of rank $m$ if and only if 1) $S$ is of rank $m$ and 2) the quadratic forms

$$\sum_{i,k=0}^{m-1} s_{i+k} x_i x_k, \qquad \sum_{i,k=0}^{m-1} s_{i+k+1} x_i x_k$$

are positive definite.


80 See p. 99, Example 1. 
81 See [173]. 




From this theorem and Theorem 17 we obtain: 


THEOREM 20: A real polynomial $f(z)$ of degree $n$ is a Hurwitz polynomial
if and only if the corresponding infinite Hankel matrix $S = \| s_{i+k} \|_0^{\infty}$
is totally positive of rank $m = [n/2]$ and if, in addition, $s_{-1} > 0$ when
$n$ is odd.


Here the elements $s_0, s_1, s_2, \ldots$ of $S$ and $s_{-1}$ are determined by the
expansion

$$\frac{g(u)}{h(u)} = s_{-1} + \frac{s_0}{u} - \frac{s_1}{u^2} + \frac{s_2}{u^3} - \cdots, \tag{129}$$
where

$$f(z) = h(z^2) + z g(z^2).$$


§ 17. Theorems of Markov and Chebyshev 


1. In a notable memoir ‘On functions obtained by converting series into
continued fractions’^82 Markov proved two theorems, the second of which had
been established in 1892 by Chebyshev by other methods, and not in the
same generality.^83

In this section we shall show that these theorems have an immediate bear- 
ing on the study of the domain of stability in the Markov parameters and 
shall give a comparatively simple proof (without reference to continued 
fractions) which is based on Theorem 19 of the preceding section. 

In proceeding to state the first theorem, we quote the corresponding
passage from the above-mentioned memoir of Markov:^84


On the basis of what has preceded it is not difficult to prove
two remarkable theorems with which we conclude our paper.
One is concerned with the determinants^85

$$\Delta_1, \Delta_2, \ldots, \Delta_m, \ \Delta^{(1)}, \Delta^{(2)}, \ldots, \Delta^{(m)}$$
and the other with the roots of the equation^86
$$\psi_m(x) = 0.$$


82 Zap. Petersburg Akad. Nauk, Petersburg, 1894 [in Russian]; see also [38], 
pp. 78-105. 


83 This theorem was first published in Chebyshev’s paper ‘On the expansion in con- 
tinued fractions of series in descending powers of the variable’ [in Russian]. See [8], 
pp. 307-62. 

84 [38], p. 95, beginning with line 3 from below.

85 In our notation, $D_1, D_2, \ldots, D_m, D_1^{(1)}, D_2^{(1)}, \ldots, D_m^{(1)}$. (See p. 236.)

86 In our notation, $h(-x) = 0$.




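THEOREM ON DETERMINANTS: If we have for the numbers

$$s_0, s_1, s_2, \ldots, s_{2m-2}, s_{2m-1}$$
two sets of values

1. $s_0 = a_0, \ s_1 = a_1, \ s_2 = a_2, \ \ldots, \ s_{2m-2} = a_{2m-2}, \ s_{2m-1} = a_{2m-1}$,
2. $s_0 = b_0, \ s_1 = b_1, \ s_2 = b_2, \ \ldots, \ s_{2m-2} = b_{2m-2}, \ s_{2m-1} = b_{2m-1}$

for which all the determinants

$$
\Delta_1 = s_0, \quad \Delta_2 = \begin{vmatrix} s_0 & s_1 \\ s_1 & s_2 \end{vmatrix}, \ \ldots, \quad
\Delta_m = \begin{vmatrix} s_0 & s_1 & \cdots & s_{m-1} \\ s_1 & s_2 & \cdots & s_m \\ \vdots & \vdots & & \vdots \\ s_{m-1} & s_m & \cdots & s_{2m-2} \end{vmatrix},
$$

$$
\Delta^{(1)} = s_1, \quad \Delta^{(2)} = \begin{vmatrix} s_1 & s_2 \\ s_2 & s_3 \end{vmatrix}, \ \ldots, \quad
\Delta^{(m)} = \begin{vmatrix} s_1 & s_2 & \cdots & s_m \\ s_2 & s_3 & \cdots & s_{m+1} \\ \vdots & \vdots & & \vdots \\ s_m & s_{m+1} & \cdots & s_{2m-1} \end{vmatrix}
$$

turn out to be positive numbers satisfying the inequalities
$$a_0 \geq b_0, \ b_1 \geq a_1, \ a_2 \geq b_2, \ b_3 \geq a_3, \ \ldots, \ a_{2m-2} \geq b_{2m-2}, \ b_{2m-1} \geq a_{2m-1},$$
then our determinants

$$\Delta_1, \Delta_2, \ldots, \Delta_m; \ \Delta^{(1)}, \Delta^{(2)}, \ldots, \Delta^{(m)}$$
must be positive for all values
$$s_0, s_1, s_2, \ldots, s_{2m-1}$$
satisfying the inequalities

$$a_0 \geq s_0 \geq b_0, \ b_1 \geq s_1 \geq a_1, \ a_2 \geq s_2 \geq b_2, \ \ldots, \ a_{2m-2} \geq s_{2m-2} \geq b_{2m-2}, \ b_{2m-1} \geq s_{2m-1} \geq a_{2m-1}.$$

Under these conditions we have

$$
\begin{vmatrix} a_0 & a_1 & \cdots & a_{k-1} \\ a_1 & a_2 & \cdots & a_k \\ \vdots & \vdots & & \vdots \\ a_{k-1} & a_k & \cdots & a_{2k-2} \end{vmatrix}
\geq
\begin{vmatrix} s_0 & s_1 & \cdots & s_{k-1} \\ s_1 & s_2 & \cdots & s_k \\ \vdots & \vdots & & \vdots \\ s_{k-1} & s_k & \cdots & s_{2k-2} \end{vmatrix}
\geq
\begin{vmatrix} b_0 & b_1 & \cdots & b_{k-1} \\ b_1 & b_2 & \cdots & b_k \\ \vdots & \vdots & & \vdots \\ b_{k-1} & b_k & \cdots & b_{2k-2} \end{vmatrix}
$$

and

$$
\begin{vmatrix} b_1 & b_2 & \cdots & b_k \\ b_2 & b_3 & \cdots & b_{k+1} \\ \vdots & \vdots & & \vdots \\ b_k & b_{k+1} & \cdots & b_{2k-1} \end{vmatrix}
\geq
\begin{vmatrix} s_1 & s_2 & \cdots & s_k \\ s_2 & s_3 & \cdots & s_{k+1} \\ \vdots & \vdots & & \vdots \\ s_k & s_{k+1} & \cdots & s_{2k-1} \end{vmatrix}
\geq
\begin{vmatrix} a_1 & a_2 & \cdots & a_k \\ a_2 & a_3 & \cdots & a_{k+1} \\ \vdots & \vdots & & \vdots \\ a_k & a_{k+1} & \cdots & a_{2k-1} \end{vmatrix}
$$

for $k = 1, 2, \ldots, m$.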


In order to give another statement of this theorem in connection with
the problem of stability, we introduce some concepts and notations.

The Markov parameters $s_0, s_1, \ldots, s_{2m-1}$ (for $n = 2m$) or $s_{-1}, s_0, s_1, \ldots, s_{2m-1}$
(for $n = 2m + 1$) will be regarded as the coordinates of some point $P$
in an $n$-dimensional space. The domain of stability in this space will be
denoted by $G$. The domain $G$ is characterized by the inequalities (115) and
(116) (p. 236).

We shall say that a point $P = \{s_i\}$ ‘precedes’ a point $P^* = \{s_i^*\}$ and
shall write $P \prec P^*$ if

$$s_0 \leq s_0^*, \ s_1 \geq s_1^*, \ s_2 \leq s_2^*, \ s_3 \geq s_3^*, \ \ldots, \ s_{2m-1} \geq s_{2m-1}^*$$

and (for $n = 2m + 1$) $\tag{130}$
$$s_{-1} \geq s_{-1}^*,$$

and the sign $<$ holds in at least one of these relations.
If only the relations (130) hold, without the last clause, then we shall
write:

$$P \preceq P^*.$$

We shall say that a point $Q$ lies ‘between’ $P$ and $R$ if $P \preceq Q \preceq R$.
To every point $P$ there corresponds an infinite Hankel matrix of rank
$m$: $S = \| s_{i+k} \|_0^{\infty}$. We shall denote this matrix by $S_P$.


Now we can state Markov’s theorem in the following way : 


THEOREM 21 (Markov): If two points $P$ and $R$ belong to the domain of
stability $G$ and if $P$ precedes $R$, then every point $Q$ between $P$ and $R$ also
belongs to $G$, i.e.,

from $P, R \in G$, $P \preceq Q \preceq R$ it follows that $Q \in G$.


Proof. From $P \preceq Q \preceq R$ it follows that $P$ and $R$ can be connected
by an arc of a curve

$$s_i = (-1)^i \varphi_i(t) \qquad [\alpha \leq t \leq \gamma; \ i = 0, 1, \ldots, 2m-1 \text{ and (for } n = 2m+1) \ i = -1] \tag{131}$$




passing through $Q$ such that: 1) the functions $\varphi_i(t)$ are continuous, monotonic
increasing, and differentiable when $t$ varies from $t = \alpha$ to $t = \gamma$; and
2) the values $\alpha, \beta, \gamma$ ($\alpha < \beta < \gamma$) of $t$ correspond to the points $P, Q, R$ on
the curve.

From the values (131) we form the infinite Hankel matrix $S = S(t) = \| s_{i+k}(t) \|_0^{\infty}$
of rank $m$. We consider part of this matrix, namely the rectangular matrix

$$
\begin{pmatrix} s_0 & s_1 & \cdots & s_{m-1} & s_m \\ s_1 & s_2 & \cdots & s_m & s_{m+1} \\ \vdots & \vdots & & \vdots & \vdots \\ s_{m-1} & s_m & \cdots & s_{2m-2} & s_{2m-1} \end{pmatrix}. \tag{132}
$$

By the conditions of the theorem, the matrix $S(t)$ is totally positive of
rank $m$ for $t = \alpha$ and $t = \gamma$, so that all the minors of (132) of order $p = 1, 2,
3, \ldots, m$ are positive.

We shall now show that this property also holds for every intermediate
value of $t$ ($\alpha < t < \gamma$).

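For $p = 1$, this is obvious. Let us prove the statement for the minors
of order $p$, on the assumption that it is true for those of order $p - 1$. We
consider an arbitrary minor of order $p$ formed from successive rows and
columns of (132):

$$
D_p^{(q)} = \begin{vmatrix} s_q & s_{q+1} & \cdots & s_{q+p-1} \\ s_{q+1} & s_{q+2} & \cdots & s_{q+p} \\ \vdots & \vdots & & \vdots \\ s_{q+p-1} & s_{q+p} & \cdots & s_{q+2p-2} \end{vmatrix}
\qquad [q = 0, 1, \ldots, 2(m-p)+1]. \tag{133}
$$

We compute the derivative of this minor

$$\frac{d}{dt} D_p^{(q)} = \sum_{i,k=0}^{p-1} \frac{\partial D_p^{(q)}}{\partial s_{q+i+k}} \frac{d s_{q+i+k}}{dt}; \tag{134}$$

here $\dfrac{\partial D_p^{(q)}}{\partial s_{q+i+k}}$ ($i, k = 0, 1, \ldots, p-1$) are the algebraic complements (adjoints)
of the elements of $D_p^{(q)}$. Since by assumption all the minors of this determinant
are positive, we have

$$(-1)^{i+k} \frac{\partial D_p^{(q)}}{\partial s_{q+i+k}} > 0 \qquad (i, k = 0, 1, \ldots, p-1). \tag{135}$$

On the other hand, we find from (131):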




$$(-1)^{q+i+k} \frac{d s_{q+i+k}}{dt} = \frac{d \varphi_{q+i+k}}{dt} \geq 0 \qquad (i, k = 0, 1, \ldots, p-1). \tag{136}$$
From (134), (135), and (136) it follows that

$$
(-1)^q \frac{d}{dt} D_p^{(q)} \geq 0 \qquad
\begin{pmatrix} q = 0, 1, \ldots, 2(m-p)+1, \\ p = 1, 2, \ldots, m, \\ \alpha \leq t \leq \gamma \end{pmatrix}. \tag{137}
$$

Thus, when the argument increases from $t = \alpha$ to $t = \gamma$, then every minor
(133) with even $q$ is a monotone non-decreasing function and with odd $q$
is a monotone non-increasing function; but since the minor is positive for
$t = \alpha$ and $t = \gamma$, it is also positive for every intermediate value of $t$
($\alpha < t < \gamma$).

From the fact that the minors of (132) of order $p - 1$ and those of order
$p$ that are formed from successive rows and columns are positive, it now
follows that all the minors of (132) of order $p$ are positive.^87

What we have proved implies that for every $t$ ($\alpha \leq t \leq \gamma$) the values
$s_0, s_1, \ldots, s_{2m-1}$ and (for $n = 2m + 1$) $s_{-1}$ satisfy the inequalities (115)
and (116), i.e., that for every $t$ these values are the Markov parameters of
a certain Hurwitz polynomial. In other words, the whole arc (131) and,
in particular, the point $Q$ lies in the domain of stability $G$.

This completes the proof of Markov’s Theorem. 
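Markov's theorem can be illustrated numerically for $n = 4$ ($m = 2$). The following is a minimal sketch: the two points `P` and `R` are hypothetical sample data chosen so that both lie in $G$ and $P$ precedes $R$, and the helper names are illustrative. The code then tests a grid of points lying between them.

```python
from fractions import Fraction
from itertools import product

def det(matrix):
    """Determinant by exact Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def in_domain(s):
    """Inequalities (115) for n = 2m: all D_p and D_p^(1) are positive."""
    m = len(s) // 2
    return all(det([[s[i + k + shift] for k in range(p)] for i in range(p)]) > 0
               for p in range(1, m + 1) for shift in (0, 1))

def precedes_or_equal(p, r):
    """Relations (130) without strictness: even-indexed coordinates grow, odd-indexed shrink."""
    return all((a <= b) if i % 2 == 0 else (a >= b)
               for i, (a, b) in enumerate(zip(p, r)))

# Hypothetical points of G (coordinates s_0, s_1, s_2, s_3).
P = [Fraction(2), Fraction(3), Fraction(5), Fraction(9)]
R = [Fraction(3), Fraction(29, 10), Fraction(51, 10), Fraction(9)]
steps = 4
# Every coordinate varies independently between its value at P and its value at R.
grid = product(*[[a + (b - a) * Fraction(j, steps) for j in range(steps + 1)]
                 for a, b in zip(P, R)])
between_all_stable = all(in_domain(list(q)) for q in grid)
```

Every grid point $Q$ satisfies $P \preceq Q \preceq R$, so the theorem predicts that `between_all_stable` is true; a grid test of this kind is of course only a spot check, not a proof.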


Note. Since we have proved that every point of the arc (131) belongs
to $G$, the values of (131) for every $t$ ($\alpha \leq t \leq \gamma$) determine a totally positive
matrix $S(t) = \| s_{i+k}(t) \|_0^{\infty}$ of rank $m$. Therefore the inequalities (135)
and consequently (137) as well hold for every $t$ ($\alpha \leq t \leq \gamma$), i.e., with increasing
$t$ every $D_p^{(q)}$ increases for even $q$ and decreases for odd $q$ ($q = 0, 1,
2, \ldots, 2(m-p)+1$; $p = 1, \ldots, m$). In other words, from $P \preceq Q \preceq R$
it follows that

$$(-1)^q D_p^{(q)}(P) \leq (-1)^q D_p^{(q)}(Q) \leq (-1)^q D_p^{(q)}(R)$$
$$(q = 0, 1, \ldots, 2(m-p)+1; \ p = 1, \ldots, m).$$

These inequalities for $q = 0, 1$ give Markov's inequalities (p. 241).
We now come to the Chebyshev-Markov theorem mentioned at the beginning
of this section. Again we quote from Markov's memoir:^88


87 This follows from Fekete's determinant identity (see [17], pp. 306-7).
88 See [38], p. 103, beginning with line 5. 


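THEOREM ON ROOTS: If the numbers

$$
\begin{gathered}
a_0, a_1, a_2, \ldots, a_{2m-2}, a_{2m-1}, \\
s_0, s_1, s_2, \ldots, s_{2m-2}, s_{2m-1}, \\
b_0, b_1, b_2, \ldots, b_{2m-2}, b_{2m-1}
\end{gathered}
$$

satisfy all the conditions of the preceding theorem,^89 then the
equations

$$
\begin{vmatrix} a_0 & a_1 & \cdots & a_{m-1} & 1 \\ a_1 & a_2 & \cdots & a_m & x \\ \vdots & \vdots & & \vdots & \vdots \\ a_m & a_{m+1} & \cdots & a_{2m-1} & x^m \end{vmatrix} = 0, \quad
\begin{vmatrix} s_0 & s_1 & \cdots & s_{m-1} & 1 \\ s_1 & s_2 & \cdots & s_m & x \\ \vdots & \vdots & & \vdots & \vdots \\ s_m & s_{m+1} & \cdots & s_{2m-1} & x^m \end{vmatrix} = 0, \quad
\begin{vmatrix} b_0 & b_1 & \cdots & b_{m-1} & 1 \\ b_1 & b_2 & \cdots & b_m & x \\ \vdots & \vdots & & \vdots & \vdots \\ b_m & b_{m+1} & \cdots & b_{2m-1} & x^m \end{vmatrix} = 0
$$

of degree $m$ in the unknown $x$ do not have multiple or imaginary
or negative roots.


And the roots of the second equation are larger than the corresponding
roots of the first equation and smaller than the corresponding
roots of the last equation.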


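Let us find out the connection of this theorem with the domain of stability
in the space of the Markov parameters. Setting $f(z) = h(z^2) + z g(z^2)$
and

$$h(-v) = c_0 v^m + c_1 v^{m-1} + \cdots + c_m \qquad (c_0 \neq 0),$$

we obtain from the expansion (see (105) and (106))

$$R(v) = -\frac{g(-v)}{h(-v)} = -s_{-1} + \frac{s_0}{v} + \frac{s_1}{v^2} + \frac{s_2}{v^3} + \cdots$$

the identity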


89 He refers to the preceding theorem, Markov's theorem on determinants (p. 241).




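$$-g(-v) = \left( -s_{-1} + \frac{s_0}{v} + \frac{s_1}{v^2} + \frac{s_2}{v^3} + \cdots \right) \left( c_0 v^m + c_1 v^{m-1} + \cdots + c_m \right).$$

Equating to zero the coefficients of the powers $v^{-1}, v^{-2}, \ldots, v^{-m}$, we find:

$$
\begin{aligned}
s_0 c_m + s_1 c_{m-1} + \cdots + s_m c_0 &= 0, \\
s_1 c_m + s_2 c_{m-1} + \cdots + s_{m+1} c_0 &= 0, \\
\cdots\cdots\cdots\cdots\cdots\cdots & \\
s_{m-1} c_m + s_m c_{m-1} + \cdots + s_{2m-1} c_0 &= 0;
\end{aligned} \tag{138}
$$

to these relations we add the equation

$$h(-v) = 0, \tag{139}$$
written as
$$c_m + c_{m-1} v + \cdots + c_0 v^m = 0. \tag{139'}$$

Eliminating from (138) and (139') the coefficients $c_0, c_1, \ldots, c_m$, we represent
the equation (139) in the form

$$
\begin{vmatrix} s_0 & s_1 & \cdots & s_{m-1} & 1 \\ s_1 & s_2 & \cdots & s_m & v \\ \vdots & \vdots & & \vdots & \vdots \\ s_m & s_{m+1} & \cdots & s_{2m-1} & v^m \end{vmatrix} = 0.
$$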


Thus, the algebraic equation in the Chebyshev-Markov theorem coincides
with (139) and the inequalities imposed on $s_0, s_1, \ldots, s_{2m-1}$ coincide with
the inequalities (115) that determine the domain of stability in the space
of the Markov parameters.

The Chebyshev-Markov theorem shows how the roots $u_1 = -v_1$,
$u_2 = -v_2, \ldots, u_m = -v_m$ of $h(u)$ change when the corresponding Markov
parameters $s_0, s_1, \ldots, s_{2m-1}$ vary in the domain of stability.

The first part of the theorem states something we already know: When
the inequalities (115) are satisfied, then all the roots $u_1, u_2, \ldots, u_m$ of $h(u)$
are simple, real, and negative.^90 We denote them as follows:

$$u_1(P), u_2(P), \ldots, u_m(P),$$

where $P$ is the corresponding point of $G$.
The second (fundamental) part of the Chebyshev-Markov theorem can
be stated as follows:


90 See Theorem 13, on p. 228. 




THEOREM 22 (Chebyshev-Markov): If $P$ and $Q$ are two points of $G$ and
$P$ ‘precedes’ $Q$,
$$P \prec Q, \tag{140}$$
then^91

$$u_1(P) < u_1(Q), \ u_2(P) < u_2(Q), \ \ldots, \ u_m(P) < u_m(Q). \tag{141}$$


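Proof. The coefficients of $h(u)$ can be expressed rationally in terms of
the parameters $s_0, s_1, \ldots, s_{2m-1}$.^92 Then

$$h(u_j) = 0 \qquad (j = 1, 2, \ldots, m)$$

implies that:^93

$$\frac{\partial u_j}{\partial s_l} = -\frac{1}{h'(u_j)} \frac{\partial h(u_j)}{\partial s_l} \qquad (j = 1, 2, \ldots, m; \ l = 0, 1, \ldots, 2m-1). \tag{142}$$

On the other hand, when we differentiate the expansion
$$\frac{g(u)}{h(u)} = s_{-1} + \frac{s_0}{u} - \frac{s_1}{u^2} + \frac{s_2}{u^3} - \cdots$$
term by term with respect to $s_l$ we find:

$$\frac{h(u) \dfrac{\partial g(u)}{\partial s_l} - g(u) \dfrac{\partial h(u)}{\partial s_l}}{h^2(u)} = \frac{(-1)^l}{u^{l+1}} + \frac{1}{u^{2m+1}} (*). \tag{143}$$

Multiplying both sides of this equation by $\dfrac{h^2(u)}{u - u_j}$ and denoting the coefficient
of $u^l$ in this polynomial by $C_{jl}$, we obtain:

$$\frac{h(u)}{u - u_j} \frac{\partial g(u)}{\partial s_l} - \frac{g(u)}{u - u_j} \frac{\partial h(u)}{\partial s_l} = \cdots + \frac{(-1)^l C_{jl}}{u} + \cdots. \tag{144}$$

Comparing the coefficients of $1/u$ (the residues) on the two sides of (144),
we find:

$$(-1)^{l+1} g(u_j) \frac{\partial h(u_j)}{\partial s_l} = C_{jl}, \tag{145}$$

which gives in conjunction with (142):

$$\frac{\partial u_j}{\partial s_l} = \frac{(-1)^l C_{jl}}{g(u_j) h'(u_j)}.$$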
91 In other words, the roots $u_1, u_2, \ldots, u_m$ increase with increasing $s_0, s_2, \ldots, s_{2m-2}$ and
with decreasing $s_1, s_3, \ldots, s_{2m-1}$.

92 For example, by the equations (138) if, for simplicity, we set $c_0 = 1$ in these equations.

93 Here $\dfrac{\partial h(u_j)}{\partial s_l} = \left[ \dfrac{\partial h(u)}{\partial s_l} \right]_{u = u_j}$.




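Introducing the values

$$R_j = \frac{g(u_j)}{h'(u_j)} \qquad (j = 1, 2, \ldots, m), \tag{146}$$

we obtain the formula of Chebyshev-Markov:

$$\frac{\partial u_j}{\partial s_l} = \frac{(-1)^l C_{jl}}{R_j [h'(u_j)]^2} \qquad (j = 1, 2, \ldots, m; \ l = 0, 1, \ldots, 2m-1). \tag{147}$$

But in the domain of stability the values $R_j$ ($j = 1, 2, \ldots, m$) are positive
(see (90') on p. 226). The same can be said of the coefficients $C_{jl}$. For

$$\frac{h^2(u)}{u - u_j} = c_0^2 (u + v_1)^2 \cdots (u + v_{j-1})^2 (u + v_j)(u + v_{j+1})^2 \cdots (u + v_m)^2, \tag{148}$$
where

$$v_j = -u_j > 0 \qquad (j = 1, 2, \ldots, m).$$

From (148) it is clear that all the coefficients $C_{jl}$ in the expansion of $\dfrac{h^2(u)}{u - u_j}$
in powers of $u$ are positive. Thus, we obtain from the Chebyshev-Markov
formula:

$$(-1)^l \frac{\partial u_j}{\partial s_l} > 0. \tag{149}$$

In the proof of Markov's theorem we have shown that any two points
$P \prec Q$ of $G$ can be joined by an arc $s_l = (-1)^l \varphi_l(t)$ ($l = 0, 1, \ldots, 2m-1$),
where $\varphi_l(t)$ is a monotonic increasing differentiable function of $t$ ($t$ varies
within the limits $\alpha$ and $\beta$ ($\alpha < \beta$) and $t = \alpha$ corresponds to $P$, $t = \beta$ to $Q$).
Then along this arc we have, by (149):^94

$$\frac{d u_j}{dt} = \sum_{l=0}^{2m-1} \frac{\partial u_j}{\partial s_l} \frac{d s_l}{dt} \geq 0, \qquad \frac{d u_j}{dt} \not\equiv 0 \qquad (\alpha \leq t \leq \beta). \tag{150}$$

Hence by integrating we obtain:
$$u_j \big|_{t=\alpha} = u_j(P) < u_j \big|_{t=\beta} = u_j(Q) \qquad (j = 1, 2, \ldots, m).$$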


This completes the proof of the Chebyshev-Markov theorem. 
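The monotonic dependence of the roots on the Markov parameters can be observed in the smallest non-trivial case $m = 2$. The sketch below is an illustration only, restricted to $m = 2$ and using floating-point arithmetic; the function name `roots_of_h` and the two sample points are assumptions. It obtains $c_1, c_2$ from (138) with $c_0 = 1$ and then solves $h(-v) = v^2 + c_1 v + c_2 = 0$.

```python
import math

def roots_of_h(s):
    """Roots u_1 < u_2 of h(u) for m = 2, from Markov parameters (s_0, s_1, s_2, s_3)."""
    s0, s1, s2, s3 = s
    # (138) with c_0 = 1:  s0*c2 + s1*c1 = -s2,  s1*c2 + s2*c1 = -s3  (Cramer's rule).
    d = s0 * s2 - s1 * s1
    c2 = (-s2 * s2 + s1 * s3) / d
    c1 = (-s0 * s3 + s1 * s2) / d
    # h(-v) = v^2 + c1*v + c2 has the roots v = (-c1 +/- sqrt(c1^2 - 4*c2)) / 2, and u = -v.
    root = math.sqrt(c1 * c1 - 4 * c2)
    return ((c1 - root) / 2, (c1 + root) / 2)

# Hypothetical points of G with P preceding Q: only s_0 differs, and it is larger at Q.
u_P = roots_of_h((2, 3, 5, 9))
u_Q = roots_of_h((3, 3, 5, 9))
```

For the first point the equations give $c_1 = -3$, $c_2 = 2$, hence $v = 1, 2$ and $u_1 = -2$, $u_2 = -1$; for the second point both roots move to the right, in agreement with (141).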


§ 18. The Generalized Routh-Hurwitz Problem 


1. In this section we shall give a rule to determine the number of roots in the
right half-plane of a polynomial $f(z)$ with complex coefficients.


94 Since $(-1)^l \dfrac{d s_l}{dt} = \dfrac{d \varphi_l}{dt} \geq 0$ ($\alpha \leq t \leq \beta$) and for at least one $l$ there exist values
of $t$ for which $(-1)^l \dfrac{d s_l}{dt} > 0$.




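Suppose that
$$f(iz) = b_0 z^n + b_1 z^{n-1} + \cdots + b_n + i \left( a_0 z^n + a_1 z^{n-1} + \cdots + a_n \right), \tag{151}$$

where $a_0, a_1, \ldots, a_n, b_0, b_1, \ldots, b_n$ are real numbers. If the degree of $f(z)$
is $n$, then $b_0 + i a_0 \neq 0$. Without loss of generality we may assume that
$a_0 \neq 0$ (otherwise we could replace $f(z)$ by $i f(z)$).

We shall assume that the real polynomials

$$a_0 z^n + a_1 z^{n-1} + \cdots + a_n \quad \text{and} \quad b_0 z^n + b_1 z^{n-1} + \cdots + b_n \tag{152}$$

are co-prime, i.e., that their resultant does not vanish:^95

$$
\nabla_{2n} = \begin{vmatrix}
a_0 & a_1 & \cdots & a_n & 0 & \cdots & 0 \\
b_0 & b_1 & \cdots & b_n & 0 & \cdots & 0 \\
0 & a_0 & \cdots & a_{n-1} & a_n & \cdots & 0 \\
0 & b_0 & \cdots & b_{n-1} & b_n & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & & \vdots
\end{vmatrix} \neq 0. \tag{153}
$$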


Hence it follows, in particular, that the polynomials (152) have no roots in
common and that, therefore, $f(z)$ has no roots on the imaginary axis.

We denote by $k$ the number of roots of $f(z)$ with positive real parts.
By considering the domain in the right half-plane bounded by the imaginary
axis and the semi-circle of radius $R$ ($R \to \infty$) and by repeating verbatim
the arguments used on p. 177 for the real polynomial $f(z)$, we obtain the
formula for the increment of $\arg f(z)$ along the imaginary axis

$$\Delta_{-\infty}^{+\infty} \arg f(iz) = (n - 2k) \pi. \tag{154}$$
Hence we obtain, by (151), in view of $a_0 \neq 0$:

$$I_{-\infty}^{+\infty} \frac{b_0 z^n + b_1 z^{n-1} + \cdots + b_n}{a_0 z^n + a_1 z^{n-1} + \cdots + a_n} = n - 2k. \tag{155}$$
Using Theorem 10 of § 11 (p. 215), we now obtain:
$$k = V(1, \nabla_2, \nabla_4, \ldots, \nabla_{2n}), \tag{156}$$

where


95 $\nabla_{2n}$ is a determinant of order $2n$.




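$$
\nabla_{2p} = \begin{vmatrix}
a_0 & a_1 & \cdots & a_{2p-1} \\
b_0 & b_1 & \cdots & b_{2p-1} \\
0 & a_0 & \cdots & a_{2p-2} \\
0 & b_0 & \cdots & b_{2p-2} \\
\vdots & \vdots & & \vdots
\end{vmatrix}
\qquad (p = 1, 2, \ldots, n; \ a_k = b_k = 0 \text{ for } k > n). \tag{157}
$$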


We have thus arrived at the following theorem. 


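THEOREM 23: If a complex polynomial $f(z)$ is given for which
$$f(iz) = b_0 z^n + b_1 z^{n-1} + \cdots + b_n + i \left( a_0 z^n + a_1 z^{n-1} + \cdots + a_n \right) \qquad (a_0 \neq 0)$$

and if the polynomials $a_0 z^n + \cdots + a_n$ and $b_0 z^n + \cdots + b_n$ are co-prime
($\nabla_{2n} \neq 0$), then the number of roots of $f(z)$ in the right half-plane is determined
by the formulas (156) and (157).
Moreover, if some of the determinants (157) vanish, then for each group
of successive zeros
$$(\nabla_{2h} \neq 0) \quad \nabla_{2h+2} = \cdots = \nabla_{2h+2p} = 0 \quad (\nabla_{2h+2p+2} \neq 0) \tag{158}$$
in the calculation of $V(1, \nabla_2, \nabla_4, \ldots, \nabla_{2n})$ we must set:

$$\operatorname{sign} \nabla_{2h+2j} = (-1)^{\frac{j(j-1)}{2}} \operatorname{sign} \nabla_{2h} \qquad (j = 1, 2, \ldots, p) \tag{159}$$

or, what is the same,

$$
V(\nabla_{2h}, \nabla_{2h+2}, \ldots, \nabla_{2h+2p}, \nabla_{2h+2p+2}) =
\begin{cases}
\dfrac{p+1}{2} & \text{for odd } p, \\[2ex]
\dfrac{p+1-\varepsilon}{2} & \text{for even } p \text{ and } \varepsilon = (-1)^{\frac{p}{2}} \operatorname{sign} \dfrac{\nabla_{2h+2p+2}}{\nabla_{2h}}.
\end{cases} \tag{160}
$$

We leave it to the reader to verify that in the special case where $f(z)$
is a real polynomial we can obtain the Routh-Hurwitz theorem (see § 6)
from Theorem 23.^96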
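The rule (156), (157) can be turned into a short computation in the regular case, where none of the determinants $\nabla_{2p}$ vanishes. The following is a minimal sketch under that assumption: the function names are illustrative, the coefficients are taken to be exactly representable numbers, and the singular case covered by (159) is deliberately not handled.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by exact Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def right_half_plane_roots(coeffs):
    """coeffs = [c_0, ..., c_n] (real or complex, c_0 != 0) of f(z) = c_0 z^n + ... + c_n.
    Returns k = V(1, nabla_2, ..., nabla_2n); raises if some determinant vanishes."""
    n = len(coeffs) - 1
    b, a = [], []
    for j, c in enumerate(coeffs):
        re, im = Fraction(c.real), Fraction(c.imag)
        for _ in range((n - j) % 4):      # multiply by i^(n-j): (re, im) -> (-im, re)
            re, im = -im, re
        b.append(re)                      # f(iz) = sum b_j z^(n-j) + i * sum a_j z^(n-j)
        a.append(im)
    if a[0] == 0:                         # replace f by i*f, as in the text
        b, a = [-x for x in a], b

    def coef(seq, idx):
        return seq[idx] if 0 <= idx <= n else Fraction(0)

    signs = [1]
    for p in range(1, n + 1):
        # Rows of nabla_2p alternate between the a's and the b's, shifted right by r // 2.
        rows = [[coef(a if r % 2 == 0 else b, col - r // 2) for col in range(2 * p)]
                for r in range(2 * p)]
        d = det(rows)
        if d == 0:
            raise ValueError("vanishing determinant: rule (159) is not handled here")
        signs.append(1 if d > 0 else -1)
    return sum(1 for x, y in zip(signs, signs[1:]) if x != y)
```

For instance, $f(z) = z^2 - 3z + 2 = (z-1)(z-2)$ leads, after the replacement of $f$ by $if$, to $\nabla_2 = -3$, $\nabla_4 = 18$, so that $V(1, -3, 18) = 2$, the number of roots in the right half-plane.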

In conclusion, we mention that in this chapter we have dealt with the 
application of quadratic forms (in particular, Hankel forms) to one problem 
of the disposition of the roots of a polynomial in the complex plane. Quad- 
ratic and hermitian forms also have interesting applications to other prob- 
lems of the disposition of roots. We refer the reader who is interested in 
these questions to the survey, already quoted, of M. G. Krein and M. A. 
Naimark ‘The method of symmetric and hermitian forms in the theory 
of separation of roots of algebraic equations,’ (Kharkov, 1936). 


96 Suitable algorithms for the solution of the generalized Routh-Hurwitz problem can
be found in the monograph [41] and in the paper [39]. See also [7] and [37].


INDEX 




[Numbers in italics refer to Volume Two] 


ABSOLUTE CONCEPTS, 184 

Addition of congruences, 182 

Addition of operators, 57 

Adjoint matrix, 82 

Adjoint operator, 265 

Algebra, 17 

Algorithm of Gauss, 23ff. 

generalized, 45 

Angle between vectors, 242 

Axes, principal, 309 
reduction to, 409 


Basis(es), 51
characteristic, 73 
coordinates of vector in, 53 
Jordan, 201 
lower, 202 
orthonormal, 242, 245 
Bessel, inequality of, 259 
Bézout, generalized theorem of, 81 
Binet-Cauchy formula, 9 
Birkhoff, G. D., 147 
Block, of matrix, 41 
diagonal, isolated, 75 
Jordan, 151 
Block multiplication of matrices, 42 
Bundle of vectors, 183 
Bunyakovskii’s inequality, 255 


CARTAN, theorem of, 4

Cauchy, formula of Binet-, 9 
system of, 115 

Cauchy identity, 10 

Cauchy index, 174, 216 

Cayley, formulas of, 279 

Cayley-Hamilton theorem, 83, 197 

Cell, of matrix, 41 

Chain, see Jordan, Markov, Sturm 

Characteristic basis, 73 

Characteristic direction, 71 

Characteristic equation, 70, 310, 338 

Characteristic matrix, 82 

Characteristic polynomial, 71, 82 


Characterization of root, minimal, 319 
maximal-minimal, 321, 322 
Chebyshev, 178, 240 
polynomials of, 259 
Chebyshev-Markov, formula of, 248 
theorem of, 247 
Chetaev, 121 
Chipart, 173, 221 
Coefficients of Fourier, 261 
Coefficients of influence, reduced, 111
Column, principal, 338 
Column matrix, 2 
Columns, Jordan chains of, 165 
Components, of matrix, 105 
of operator, hermitian, 268 
skew-symmetric, 281 
symmetric, 281 
Compound matrix, 19ff., 20 
Computation of powers of matrix, 109 
Congruences, 181, 182 
Constraint, 320 
Convergence, 110, 112 
Coordinates, transformation of, 59 
of vector, 53 
Coordinate transformation, matrix of, 60 


D’ALEMBERT-EULER, theorem of, 286 
Danilevskii, 214 
Decomposition, of matrix into triangular 
factors, 33ff. 
polar, of operator, 276, 286; 6 
of space, 248 
Defect of vector space, 64 
Derivative, multiplicative, 133 
Determinant identity of Sylvester, 32, 33 
Determinant of square matrix, 1 
Diagonal matrix, 3 
Dilatation of space, 287 
Dimension, of matrix, 1 
of vector space, 51 
Direction, characteristic, 71 
Discriminant of form, 333 




Divisors, elementary, 142, 144, 194 
admissible, 238 
geometrical theory of, 175 
infinite, 27 

Dmitriev, 87

Domain of stability, 232 

Dynkin, 87


EIGENVALUE, 69 

Elements of matrix, 1 

Elimination method of Gauss, 23ff. 

Equivalence, of matrices, 61, 132, 133 
of pencils, strict, 24 


Ergodic theorem for Markov chains, 95- 


Erugin, theorem of, 122 
Euler-D’Alembert, theorem of, 286


FACTOR SPACE, 183 
Faddeev, method of, 87 
Field, 1 
Forces, linear superposition of, 28 
Form, bilinear, 294 
Hankel, 338; 205 
hermitian, 244, 331 
bilinear, 332 
canonical form of, 337 
negative definite, 337 
negative semidefinite, 336 
pencil of, see pencil 
positive definite, 337 
positive semidefinite, 336 
rank of, 333 
signature of, 334 
singular, 333 
quadratic, 246, 294 
definite, 305 
discriminant of, 294 
rank of, 296 
real, 294 
reduction of, 299ff. 
reduction to principal axes, 309 
restricted, 306
semidefinite, 304 
signature of, 296, 298 
singular, 294 
Fourier series, 261 
Frobenius, 304, 339, 343; 53 
theorem of, 343; 53 
Function, entire, 169 
left value of, 81 


GANTMACHER, 103 
Gauss, algorithm of, 23 ff. 
generalized, 45 
elimination method of, 23ff. 




Gaussian form of matrix, 39 
Golubchikov, 124 
Governors, 172, 283 
Gram, criterion of, 247 
Gramian, 247, 251 
Group, 18 

unitary, 268 
Gundelfinger, 304


HADAMARD INEQUALITY, 252 
generalized, 254 

Hamilton-Cayley theorem, 83, 197 

Hankel form, 338; 205 

Hankel matrix, 338; 205 

Hermite, 172, 202, 210 

Hermite-Biehler theorem, 228 

Hurwitz, 173, 190, 210 

Hurwitz matrix, 190 

Hyperlogarithm, 169 


IDENTITY OPERATOR, 66 

Imprimitivity, index of, 80 

Ince, 147 

Inertia, law of, 297, 334 

Integral, multiplicative, 132, 138
product, 132 

Invariant plane, of operator, 283 


JACOBI, formula of, 302, 336 
identity of, 114 
method of, 300 
theorem of, 303 
Jacobi matrix, 99 
Jordan basis, 201 
Jordan block, 151 
Jordan chains of columns, 165 
Jordan form of matrix, 152, 201, 202 
Jordan matrix, 152, 201 


KARPELEVICH, 87 
Kernel of A-matrix, 39 
Kolmogorov, 83, 87, 92 
Kotelyanskii, 103 
lemma of, 71 
Krein, 221, 250 
Kronecker, 75; 25, 37, 40
Krylov, 203 
transformation of, 206 


LAGRANGE, method of, 299 
Lagrange interpolation polynomial, 10 
Lagrange-Sylvester interpolation polynomial
λ-matrix, 180
kernel of, 39




Lappo-Danilevskii, 168, 170, 171 
Left value, 81 
Legendre polynomials, 258 
Liénard, 173, 221 
Liénard-Chipart stability criterion, 221
Limit of sequence of matrices, 33 
Linear (in)dependence of vectors, 51 
Linear transformation, 3 
Logarithm of matrix, 239
Lyapunov, 173, 185 
criterion of, 120 
equivalence in the sense of, 118 
theorem of, 187 
Lyapunov matrix, 117 
Lyapunov transformation, 117 


MACMILLAN, 115
Mapping, affine, 245 
Markov, 178, 240 
theorem of, 242 
Markov chain, acyclic, 88
cyclic, 88 
fully regular, 88 
homogeneous, 83 
period of, 96 
(ir)reducible, 38
regular, 88 
Markov parameters, 283, 234 
Matricant, 127 
Matrices, addition of, 4 
group property, 18 
annihilating polynomial of, 89 
applications to differential equations, 
116ff. 
congruence of, 296 
difference of, 5 
equivalence of, 132, 133 
equivalent, 61ff. 
left-equivalence of, 132, 133 
limit of sequence of, 33 
multiplication on left by H, 14 
product of, 6 
quotient of, 17 
rank of product, 12 
similarity of, 67 
unitary similarity of, 242 
with same real part of spectrum, 122 
adjoint, 82, 266 
reduced, 90 
blocks of, 41 
canonical form of, 63, 135, 136, 139, 141, 
152, 192, 201, 202, 264, 265 
cells of, 41 
characteristic, 82 
characteristic polynomial of, 82 


Matrix, column, 2 
commuting, 7 
companion, 149 
completely reducible, 81
complex, 1ff.
orthogonal, normal form of, 23 
representation of as product, 6 
skew-symmetric, normal form of, 18 
symmetric, normal form of, 11 
components of, 105 
compound, 19ff., 20 
computation of power of, 109 
constituent, 105 
of coordinate transformation, 60 
cyclic form of, 54 
decomposition into triangular factors, 
33ff. 
derivative of, 117 
determinant of, 1, 5 
diagonal, 3 
multiplication by, 8 
diagonal form of, 152 
dimension of, 1 
elementary, 132 
elementary divisors of, 142, 144, 194 
elements of, 1 
function of, 95ff. 
defined on spectrum, 96 
fundamental, 73 
Gaussian form of, 39 
Hankel, 338; 205 
projective, 20 
Hurwitz, 190 
idempotent, 226 
infinite, rank of, 239 
integral, 126; 113 
normalized, 114 
invariant polynomials of, 139, 144, 194 
inverse of, 15 
minors of, 19ff. 
irreducible, 50 
(im)primitive, 80
Jacobi, 99 
Jordan form of, 152, 201, 202 
λ-, 130
and linear operator, 56 
logarithm of, 239 
Lyapunov, 117 
minimal polynomial of, 89 
uniqueness of, 90 
minor of, 2 
principal, 2 
multiplication of, by number, 5 
by matrix, 17 




Matrix, nilpotent, 226 
non-negative, 50 
totally, 98 
non-singular, 15 
normal, 269 
normal form of, 150, 192, 201, 202
notation for, 1 
order of, 1 
orthogonal, 263 
oscillatory, 103 
partitioned, 41, 42 
permutable, 7 
permutation of, 50 
polynomial, see polynomial matrix 
polynomials in, permutability of, 13 
positive, 50 
spectra of, 58 6 
totally, 98 
power of, 12 
computation of, 109 
power series in, 113 
principal minor of, 2 
quasi-triangular, 43 
rank of, 2 
reducible, 50, 51 
normal form of, 75 
representation as product, 264 
root of non-singular, 233 
root of singular, 234ff., 239 
Routh, 191 
row, 2
of simple structure, 73 
singular, 15 
skew-symmetric, 19 
square, 1 
square root of, 239 
stochastic, 83 
fully regular, 88 
regular, 88 
spur of, 87 
subdiagonal of, 13 
superdiagonal of, 13 
symmetric, 19 
trace of, 87 
transformation of coordinate, 60 
transforming, 35, 60 
transpose of, 19 
triangular, 18, 218; 155 
unit, 12 
unitary, 263, 269 
unitary, representation of as product, 5 
upper quasi-triangular, 43 
upper triangular, 18 


Matrix addition, properties of, 4 




Matrix equations, 215ff. 
uniqueness of solution, 16 
Matrix multiplication, 6, 7 
Matrix polynomials, 76 
left quotient of, 78 
multiplication of, 77 
Maxwell, 172 
Mean, convergence in, of series, 260 
Metric, 242
euclidean, 245
hermitian, 243, 244
positive definite, 243
positive semidefinite, 243 
Minimal indices for columns, 38 
Minor, 2 
almost principal, 102 
of zero density, 104 
Modulus, left, 275 
Moments, problem of, 286, 287 
Motion, of mechanical system, 125 
of point, 121 
stability of, 125 
asymptotic, 125 


NAIMARK, 221, 238, 250
Nilpotency, index of, 226
Norm, left, 275
of vector, 243
Null vector, 52 
Nullity of vector space, 64 
Number space, n-dimensional, 52 


OPERATIONS, elementary, 134 
Operator (linear), 55, 66 
adjoint, 265 
decomposition of, 281 
hermitian, 268 
positive definite, 274 
positive semidefinite, 274 
projective, 20 
spectrum of, 272 
identity, 66 
invariant plane of, 283 
matrix corresponding to, 56 
normal, 268 
positive definite, 280 
positive semidefinite, 280 
normal, 280 
orthogonal, of first kind, 281 
(im)proper, 281
of second kind, 281 
polar decomposition of, 276, 286 
real, 282 
semidefinite, 274, 280 


Operator (linear), of simple structure, 72
skew-symmetric, 280
square root of, 275
symmetric, 280
transposed, 280
unitary, 268
spectrum of, 273
Operators, addition of, 57
multiplication of, 58
Order of matrix, 1 
Orlando, formula of, 196 
Orthogonal complement, 266 
Orthogonalization, 256 
Oscillations, small, of system, 326 


PARAMETERS, homogeneous, 26 
Markov, 283, 294
Parseval, equality of, 261
Peano, 127
Pencil of hermitian forms, 338
characteristic equation of, 338
characteristic values of, 338
principal vector of, 338
Pencil(s) of matrices, canonical form of, 87, 39
congruent, 41 
elementary divisors of, infinite, 27 
rank of, 29 
regular, 25 
singular, 25 
strict equivalence of, 24 
Pencil of quadratic forms, 310 
characteristic equation of, 310 
characteristic value of, 310 
principal column of, 310 
principal matrix of, 312 
principal vector of, 310 
Period, of Markov chain, 96 
Permutation of matrix, 50 
Perron, &3 
formula of, 116 
Petrovskii, 113 
Polynomial(s), annihilating, 176, 177 
minimal, 176 
of square matrix, 89 
of Chebyshev, 259 
characteristic, 71 
interpolation, 97, 101, 103 
invariant, 139, 144, 194 
of Legendre, 258 
matrix, see matrix polynomials 
minimal, 89, 176, 177 
monic, 176 
scalar, 76 
positive pair of, 227 




Polynomial matrix, 76, 130 
elementary operations on, 130, 131 
regular, 76 
order of, 76
Power of matrix, 12
Probability, absolute, 93
limiting, 94
mean limiting, 96
transition, 82
final, 88
limiting, 88
mean limiting, 96
Product, inner, of vectors, 243
scalar, of vectors, 242, 243
of operators, 58
of sequences, 6
Pythagoras, theorem of, 244


QUASI-ERGODIC THEOREM, 95 
Quasi-triangular matrix, 43 
Quotients of matrices, 17 


RANK, of infinite matrix, 239
of matrix, 2 
of pencil, 29 
of vector space, 64 
Relative concepts, 184 
Right value, 81 
Ring, 17 
Romanovskii, 83 
Root of matrix, 233, 234ff., 239 
Rotation of space, 287 
Routh, 178, 201 
criterion of, 180 
Routh-Hurwitz, criterion of, 194 
Routh matrix, 191 
Routh scheme, 179 
Row matrix, 2 


SCHLESINGER, 133 
Schur, formulas of, 46 
Schwarz, inequality of, 255 
Sequence of vectors, 256, 260 
Series, convergence of, 260 
fundamental, of solutions, 38 
Signature of quadratic form, 296, 298 
Similarity of matrices, 67 
Singularity, 143 
Smirnov, 171 
Space, coefficient, 232 
decomposition of, 177, 248 
dilatation of, 287 
euclidean, 242, 245 
extension of, to unitary space, 282 
factor, 183 




Space, rotation of, 287
unitary, 242, 243
as extension of euclidean, 282
Spectrum, 96, 272, 273; 53
Spur, 87
Square(s), independent, 297
positive, 334
Stability, criterion of, 221
domain of, 232
of motion, 125
of solution of linear system, 129
States, essential, 92
limiting, 92
non-essential, 92
Stieltjes, theorem of, 232
Stodola, 173
Sturm, theorem of, 175
Sturm chain, 175
generalized, 176
Subdiagonal, 13
Subspace, characteristic, 71
coordinate, 51
cyclic, 185
generated by vector, 185
invariant, 178
vector, 63
Substitution, integral, 148, 169
Suleimanova, 87
Superdiagonal, 13
Sylvester, identity of, 32, 33
inequality of, 66
Systems of differential equations, application of matrices to, 116ff.
equivalent, 118
reducible, 118
regular, 121, 168
singularity of, 148
stability of solution, 129
Systems of vectors, bi-orthogonal, 267
orthonormal, 245


TRACE, 87
Transformation, linear, 3
of coordinates, 59
orthogonal, 242, 263
unitary, 242, 263
written as matrix equation, 7
Lyapunov, 117
Transforming matrix, 35, 60
Transpose, 19, 280
Transposition, 18


UNIT SUM OF SQUARES, 314
Unit sphere, 315
Unit vector, 244


VALUE(S), characteristic, maximal, 53
extremal properties of, 317
latent, 69
left and right, of function, 81
proper, 69
Vector(s), 51
angle between, 242
bundle of, 183
Jordan chain of, 202
complex, 282
congruence of, 181
extremal, 55
inner product of, 243
Jordan chain of, 201
latent, 69
length of, 242, 243
linear dependence of, 51
test for, 251
modulo 7, 183
linear independence of, 51
norm of, 243
normalized, 244; 66
null, 52
orthogonal, 244, 248
orthogonalization of sequence, 256
principal, 318, 338
proper, 69
projecting, 248
projection of, orthogonal, 248
real, 282
scalar product of, 242, 243
systems of, bi-orthogonal, 267
orthonormal, 245
unit, 244
Vector space, 50ff., 51
basis of, 51
defect of, 64
dimension of, 51
finite-dimensional, 51
infinite-dimensional, 51
nullity of, 64
rank of, 64
Vector, subspace, 63
Volterra, 183, 145, 147
Vyshnegradskii, 172


WEIERSTRASS, 25


